Abstract
Effective insect detection is crucial for sustainable cotton production, yet traditional monitoring methods remain labor-intensive, inefficient, and environmentally detrimental. This study introduces Enhanced YOLO12, a novel deep learning architecture for real-time cotton insect detection. Building on the YOLO12 framework, the proposed model integrates an optimized Spatial Pyramid Pooling (SPP) module and attention-based feature extraction to improve detection accuracy while maintaining computational efficiency. To ensure robustness, we developed and evaluated multiple baseline models (standard YOLO11 and YOLO12) and custom architectures (YOLO12_Fusion, YOLO11-BRA-Net, YOLO11_CBAM, and Enhanced Hybrid YOLO12). In our experiments, Enhanced Hybrid YOLO12 achieved the best performance, with 0.942, 0.876, 0.945, and 0.735 in precision, recall, mAP50, and mAP50-95, respectively. It outperforms the standard YOLO12 on every metric (0.925, 0.848, 0.913, and 0.662, respectively). These results demonstrate that Enhanced Hybrid YOLO12 can be considered a state-of-the-art framework for precision agriculture, combining high detection accuracy with real-time capability. They therefore support the use of this deep learning model in pest management applications.
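To make the comparison in the abstract concrete, the following minimal sketch computes the absolute and relative gains of Enhanced Hybrid YOLO12 over the standard YOLO12 from the four reported values. The names `METRICS`, `enhanced`, `baseline`, and `gains` are illustrative assumptions and are not taken from the authors' repository; only the numbers come from the abstract.

```python
# Metrics reported in the abstract, in this order:
# precision, recall, mAP50, mAP50-95.
METRICS = ("precision", "recall", "mAP50", "mAP50-95")
enhanced = (0.942, 0.876, 0.945, 0.735)  # Enhanced Hybrid YOLO12
baseline = (0.925, 0.848, 0.913, 0.662)  # standard YOLO12


def gains(new, old):
    """Return {metric: (absolute gain, relative gain in percent)}."""
    return {
        name: (round(n - o, 3), round(100 * (n - o) / o, 1))
        for name, n, o in zip(METRICS, new, old)
    }


if __name__ == "__main__":
    for name, (abs_gain, rel_gain) in gains(enhanced, baseline).items():
        print(f"{name:9s} {abs_gain:+.3f} ({rel_gain:+.1f}%)")
```

Under these assumptions, the largest gain is on mAP50-95, the strictest of the four metrics, which suggests that the improvement lies mainly in localization quality at higher IoU thresholds.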
Data availability
The dataset is publicly available at https://www.scidb.cn/en/detail?dataSetId=3f36bce8e41849a6a33e34fb0f8ae581.
Code availability
The custom code used in this study to generate and analyse the results is publicly available in a GitHub repository at https://github.com/DrDinaSaif/Enhanced-YOLO-12.
Abbreviations
- A2C2F: Attention mechanism with C2F block
- AI: Artificial intelligence
- AELGNet: Attention-based enhanced local and global features network
- BERT-ResNet-PSO: Bidirectional encoder representations from transformers-residual network-particle swarm optimization
- BiFormerAF: BiFormer attention fusion
- BRA: Bi-level routing attention
- C2F: Cross stage partial bottleneck with 2 convolutional layers
- C2PSA: Convolutional block with parallel spatial attention
- C3K: Conv ×3 with kernel
- C3K2: Conv ×3 with kernel size 2
- CAM: Channel attention module
- CBAM: Convolutional block attention module
- CFNet-VoVGCSP-LSKNet-YOLOv8s: Cross-feature network-VoVNet with ghost convolutional structure and spatial pyramid-large kernel attention network-you only look once version 8
- CNN: Convolutional neural network
- COLAB: Google Colaboratory
- C2PSA: Cross-stage partial self-attention
- CSP: Cross stage partial
- DenseNet121: Densely connected convolutional network with 121 layers
- DL: Deep learning
- ECENet: EfficientNet model
- InceptionResNetV2: Inception residual network version 2
- F-measure: Fisher score-measure
- FLOPS: Floating point operations per second
- FM: Feature map
- FM-SR: Feature map-super resolution
- FN: False negative
- FP: False positive
- GCSP: Ghost CSP
- GFLOPS: Giga floating-point operations per second
- mAP: Mean average precision
- ML: Machine learning
- MSP2P: Multi-scale patch-to-patch
- ResNet50: Residual network with 50 layers
- SAM: Spatial attention module
- SPPF: Spatial pyramid pooling fast
- SpemNet: Stacking patch embedding network
- SRNet-YOLO: Super-resolution network-you only look once
- TN: True negative
- TP: True positive
- TXT: Text format
- ViT: Vision transformer
- VGG16: Visual geometry group neural network with 16 layers
- XML: Extensible markup language
- YOLO: You only look once
- YOLOv8-MDN-Tiny: YOLO version 8 with mixed density network-tiny
- YOLO11: YOLO version 11
- YOLO11-BRA-Net: YOLO11 with bi-level routing attention network
- YOLO11_CBAM: YOLO11 with convolutional block attention module
- YOLO12: YOLO version 12
- YOLO12_Fusion: YOLO12_Fusion
Funding
Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB).
Author information
Contributions
Aboul Ella Hassanien: Idea, Senior administration, Supervision, Validation, and Writing – review and editing. Amany Sarhan: Supervision, Investigation, Writing – reviewing, and Validation. Dina Saif: Data curation, Investigation, Writing – results, Visualization, and Software. Heba Askr: Methodology, Validation, Writing – original draft, and Writing – review and editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Saif, D., Askr, H., Sarhan, A.M. et al. Enhanced YOLO12 with spatial pyramid pooling for real-time cotton insect detection. Sci Rep (2026). https://doi.org/10.1038/s41598-026-35747-4
Keywords
- Cotton insect detection
- YOLO12
- Deep learning
- Insect management
- Object detection
- Precision agriculture
- Sustainability
