Integrating feature selection and explainable CNN for identification and classification of pests and beneficial insects


Abstract

Reliable identification of agricultural pests and beneficial insects is crucial for sustainable crop protection and ecological balance, yet most vision-based models remain black boxes and rely on high-dimensional features. This paper proposes an explainable hybrid insect-classification framework that combines convolutional neural network (CNN) feature extraction with a dual-XAI feature selection strategy. SHapley Additive exPlanations (SHAP) and Permutation Feature Importance (PFI) are applied in parallel to rank handcrafted and CNN-derived features, and the intersection of the two rankings yields a compact, biologically meaningful subset for final classification. The selected features are evaluated using lightweight classifiers and a hybrid ensemble, enabling accurate inference under field variability. Experiments on a curated, balanced dataset of four classes (Colorado potato beetle, green peach aphid, seven-spot ladybird, and healthy leaves), collected under diverse lighting and background conditions, achieve 96.7% overall accuracy, with precision, recall, and F1-scores all above 96%. Importantly, performance remains stable under dimensionality reduction: accuracy stays at ≥90% using only the top 11 hybrid-selected features. These results demonstrate that integrating SHAP and PFI improves both robustness and interpretability, supporting practical deployment for automated pest monitoring and precision agriculture.
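The intersection step described in the abstract can be illustrated with a minimal sketch. It assumes that per-feature importance scores have already been computed by two methods (for example, mean absolute SHAP values and permutation importances) and shows only how the two top-k lists might be intersected. The function names, feature names, and toy scores are hypothetical and are not taken from the paper. Ordering the result by summed rank is also an assumption made here for readability.

```python
from typing import Dict, List


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the largest importance scores."""
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return ranked[:k]


def intersect_top_k(shap_scores: Dict[str, float],
                    pfi_scores: Dict[str, float],
                    k: int) -> List[str]:
    """Keep only features that appear in the top k of BOTH rankings.

    The result is ordered by summed rank across the two methods
    (an illustrative choice), with ties broken alphabetically.
    """
    shap_top = top_k(shap_scores, k)
    pfi_top = top_k(pfi_scores, k)
    common = set(shap_top) & set(pfi_top)
    return sorted(
        common,
        key=lambda n: (shap_top.index(n) + pfi_top.index(n), n),
    )


# Toy, made-up scores for illustration only (not results from the paper).
shap_scores = {"hue_mean": 0.30, "cnn_12": 0.25, "edge_density": 0.20,
               "cnn_47": 0.05, "area": 0.02}
pfi_scores = {"cnn_12": 0.18, "hue_mean": 0.10, "area": 0.09,
              "edge_density": 0.01, "cnn_47": 0.00}

selected = intersect_top_k(shap_scores, pfi_scores, k=3)
print(selected)  # ['cnn_12', 'hue_mean']
```

In this toy case, `edge_density` ranks highly under one method and `area` under the other, so neither survives the intersection. Only features that both methods agree on are kept, which is the intuition behind using two explainers rather than one.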

Data availability

The selected datasets are available from a free and open-access source at the following link: https://doi.org/10.34740/kaggle/dsv/12745007

Acknowledgements

The authors extend their appreciation to the Deanship of Scientific Research and Libraries in Multimedia University for funding this research work through the Program for Supporting Publication in Top-Impact Journals.

Funding

Not applicable

Author information

Contributions

N.D. and T.R.: Conceptualization, Methodology, Software, Visualization, Formal Analysis, Writing (original draft, review & editing). M.M. and A.S.B.O.: Conceptualization, Methodology, Resources, Supervision, Visualization, Writing (review & editing). N.M.J.: Data Curation, Software, Resources, Formal Analysis. S.S.A. and A.A.M.R.: Methodology, Data Curation, Visualization, Validation, Writing (review & editing).

Corresponding authors

Correspondence to Md. Moniruzzaman, Noorlindawaty Md. Jizat or Samir Salem Al-Bawri.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

Not applicable

Consent to participate

Not applicable

Consent for publication

Not applicable

Clinical trial number

Not applicable

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Deb, N., Rahman, T., Moniruzzaman, M. et al. Integrating feature selection and explainable CNN for identification and classification of pests and beneficial insects. Sci Rep (2025). https://doi.org/10.1038/s41598-025-32520-x

  • DOI: https://doi.org/10.1038/s41598-025-32520-x

Keywords

  • Hybrid models
  • Feature selection
  • Pest detection
  • Beneficial insects
  • Machine learning
  • Agricultural informatics


Source: Ecology - nature.com
