Abstract
Reliable identification of agricultural pests and beneficial insects is crucial for sustainable crop protection and ecological balance, yet most vision-based models remain black boxes and rely on high-dimensional features. This paper proposes an explainable hybrid insect-classification framework that combines convolutional neural network (CNN) feature extraction with a dual-XAI feature selection strategy. SHapley Additive exPlanations (SHAP) and Permutation Feature Importance (PFI) are applied in parallel to rank handcrafted and CNN-derived features, and the intersection of the two rankings yields a compact, biologically meaningful subset for final classification. The selected features are evaluated using lightweight classifiers and a hybrid ensemble, enabling accurate inference under field variability. In experiments on a curated, balanced dataset of four classes (Colorado potato beetle, green peach aphid, seven-spot ladybird, and healthy leaves) collected under diverse lighting and background conditions, the framework achieves 96.7% overall accuracy, with precision, recall, and F1-scores all above 96%. Importantly, performance remains stable under dimensionality reduction, retaining ≥90% accuracy with only the top 11 hybrid-selected features. These results demonstrate that integrating SHAP and PFI improves both robustness and interpretability, supporting practical deployment for automated pest monitoring and precision agriculture.
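The dual ranking and intersection step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the data are synthetic, the classifier is a simple nearest-centroid model, all function names are hypothetical, and exact Shapley values of a subset-accuracy game stand in for SHAP (the `shap` library is not used). The cutoff `k = 3` is likewise arbitrary.

```python
import itertools
import math
import random

def make_data(n=200, n_features=6, seed=0):
    """Synthetic two-class data: only features 0 and 1 carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 3.0 * label
        row[1] -= 3.0 * label
        X.append(row)
        y.append(label)
    return X, y

def fit_centroids(X, y, feats):
    """Per-class mean vector restricted to the feature indices in feats."""
    cents = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(r[f] for r in rows) / len(rows) for f in feats]
    return cents

def accuracy(cents, X, y, feats):
    """Nearest-centroid accuracy using only the feature indices in feats."""
    hits = 0
    for x, t in zip(X, y):
        pred = min(cents, key=lambda c: sum(
            (x[f] - m) ** 2 for f, m in zip(feats, cents[c])))
        hits += pred == t
    return hits / len(y)

def permutation_importance(Xtr, ytr, Xte, yte, n_repeats=5, seed=1):
    """PFI: mean drop in held-out accuracy when one column is shuffled."""
    rng = random.Random(seed)
    feats = list(range(len(Xtr[0])))
    cents = fit_centroids(Xtr, ytr, feats)
    base = accuracy(cents, Xte, yte, feats)
    scores = []
    for f in feats:
        drops = []
        for _ in range(n_repeats):
            col = [x[f] for x in Xte]
            rng.shuffle(col)
            Xp = [x[:f] + [v] + x[f + 1:] for x, v in zip(Xte, col)]
            drops.append(base - accuracy(cents, Xp, yte, feats))
        scores.append(sum(drops) / n_repeats)
    return scores

def shapley_importance(Xtr, ytr, Xte, yte):
    """Exact Shapley values where v(S) is held-out accuracy on subset S."""
    n = len(Xtr[0])
    value = {}
    for r in range(n + 1):
        for S in itertools.combinations(range(n), r):
            value[S] = accuracy(fit_centroids(Xtr, ytr, S), Xte, yte, S)
    phi = [0.0] * n
    for i in range(n):
        others = [f for f in range(n) if f != i]
        for r in range(n):
            w = (math.factorial(r) * math.factorial(n - r - 1)
                 / math.factorial(n))
            for S in itertools.combinations(others, r):
                with_i = tuple(sorted(S + (i,)))
                phi[i] += w * (value[with_i] - value[S])
    return phi, value

def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda f: scores[f], reverse=True)
    return set(order[:k])

X_train, y_train = make_data(seed=0)
X_test, y_test = make_data(seed=1)
pfi = permutation_importance(X_train, y_train, X_test, y_test)
phi, value = shapley_importance(X_train, y_train, X_test, y_test)

k = 3  # hypothetical cutoff applied to each ranking
selected = sorted(top_k(pfi, k) & top_k(phi, k))
print("Features kept by both rankings:", selected)
```

Intersecting the two top-k sets keeps only features that both attribution methods agree on, so the retained subset can be smaller than k. Exhaustive subset enumeration is feasible here only because the toy problem has six features; for CNN-derived feature vectors an approximate estimator would be needed.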
Data availability
The datasets used in this study are freely available from an open-access repository at https://doi.org/10.34740/kaggle/dsv/12745007
Acknowledgements
The authors extend their appreciation to the Deanship of Scientific Research and Libraries in Multimedia University for funding this research work through the Program for Supporting Publication in Top-Impact Journals.
Funding
Not applicable
Author information
Contributions
N.D. and T.R.: Conceptualization, Methodology, Software, Visualization, Formal Analysis, Writing (original draft, review & editing). M.M. and A.S.B.O.: Conceptualization, Methodology, Resources, Supervision, Visualization, Writing (review & editing). N.M.J.: Data Curation, Software, Resources, Formal Analysis. S.S.A. and A.A.M.R.: Methodology, Data Curation, Visualization, Validation, Writing (review & editing).
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
Not applicable
Consent to participate
Not applicable
Consent for publication
Not applicable
Clinical trial number
Not applicable
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Deb, N., Rahman, T., Moniruzzaman, M. et al. Integrating feature selection and explainable CNN for identification and classification of pests and beneficial insects. Sci Rep (2025). https://doi.org/10.1038/s41598-025-32520-x
DOI: https://doi.org/10.1038/s41598-025-32520-x
Keywords
- Hybrid models
- Feature selection
- Pest detection
- Beneficial insects
- Machine learning
- Agricultural informatics
Source: Ecology - nature.com
