Abstract
Plant diseases pose significant threats to agriculture, making accurate diagnosis and effective treatment crucial for protecting crop yields. In automated diagnosis pipelines, image segmentation identifies and localizes diseased regions. Developing robust segmentation models for plant disease detection requires high-quality annotations. Unfortunately, existing datasets rarely include segmentation labels and are typically confined to controlled laboratory settings, which fail to capture the complexity of images taken in the wild. Motivated by these limitations, we established a large-scale segmentation dataset for plant diseases, dubbed PlantSeg. PlantSeg differs from existing datasets in three key aspects: (1) Annotation types: PlantSeg includes detailed, high-quality disease area masks. (2) Image sources: PlantSeg primarily comprises in-the-wild plant disease images rather than the laboratory images found in existing datasets. (3) Scale: PlantSeg contains the largest number of in-the-wild plant disease images, including 7,774 diseased images with corresponding segmentation masks. The dataset provides a unified benchmarking platform for developing advanced plant disease segmentation algorithms.
Data availability
The PlantSeg dataset is available for download at https://doi.org/10.5281/zenodo.17719108.
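For readers who prefer to fetch the archive programmatically, the following is a minimal sketch of how the files behind the DOI above might be listed. It assumes that Zenodo's public REST API serves record metadata at `https://zenodo.org/api/records/<id>` and that the JSON response carries a `files` list with `key` and `size` fields; both are assumptions about the service, not something stated in this article. The helper names (`zenodo_record_id`, `zenodo_api_url`, `list_files`) are hypothetical.

```python
import json
import re
import urllib.request

DOI = "10.5281/zenodo.17719108"  # DOI from the data availability statement


def zenodo_record_id(doi: str) -> str:
    """Extract the numeric record ID from a Zenodo DOI (bare or as a doi.org URL)."""
    match = re.search(r"10\.5281/zenodo\.(\d+)$", doi.strip())
    if match is None:
        raise ValueError(f"not a Zenodo DOI: {doi!r}")
    return match.group(1)


def zenodo_api_url(doi: str) -> str:
    """Build the metadata URL. The endpoint layout is an assumption about Zenodo's REST API."""
    return f"https://zenodo.org/api/records/{zenodo_record_id(doi)}"


def list_files(doi: str) -> list[tuple[str, int]]:
    """Return (file name, size in bytes) pairs for the record. Requires network access."""
    with urllib.request.urlopen(zenodo_api_url(doi), timeout=30) as response:
        record = json.load(response)
    # Assumed response shape: a "files" list whose entries carry "key" and "size".
    return [(f.get("key", "?"), f.get("size", 0)) for f in record.get("files", [])]
```

With network access, `list_files(DOI)` would return the archive's file names and sizes, which can be checked before starting a large download. The actual file layout of the PlantSeg archive is not described here, so the sketch makes no assumption about it.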
Code availability
The code for reproducing the baselines is available at https://github.com/tqwei05/PlantSeg. It builds on https://github.com/open-mmlab/mmsegmentation, a benchmark toolbox that implements numerous segmentation methods.
Acknowledgements
We thank everyone involved in image acquisition, annotation and reviewing, as well as those who contributed to this paper. This work was supported by the Australian Research Council (CE200100025, DP230101196) and the Grains Research and Development Corporation (UOQ2301-010OPX).
Author information
Contributions
Tianqi Wei designed the study, built the dataset, conducted experiments and wrote the manuscript. Zhi Chen designed the study, built the dataset and wrote the manuscript. Xin Yu designed the study and revised the manuscript. Scott Chapman and Paul Melloy supervised the annotation process, validated the data and reviewed the manuscript. Zi Huang administered the project, provided resources and reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wei, T., Chen, Z., Yu, X. et al. A Large-Scale In-the-wild Dataset for Plant Disease Segmentation. Sci Data (2026). https://doi.org/10.1038/s41597-025-06513-4
Source: Ecology - nature.com
