Abstract
Automated pollen detection is essential for ecological monitoring, allergy forecasting, and biodiversity research. However, existing methods rely heavily on manual or semi-automated annotations, limiting scalability and broader applicability. We introduce a highly automated training dataset generation pipeline that combines one-shot detection with systematic refinement, producing tens of thousands of high-quality annotations from bright-field microscopy images while significantly reducing manual effort and annotation costs. Using multi-regional datasets from France, Hungary, and Sweden, we trained object detection models on seven pollen taxa and evaluated their performance on external pure- and mixed-species slides as well as on real-world airborne samples. We assessed the reusability of pretrained vision models for pollen detection, aiming to reduce the need for extensive retraining. Using linear probing, we identified foundational Vision Transformers (ViTs) as the most effective feature extractors and integrated them into Faster R-CNN detection models. We benchmarked these models against ResNet50, a widely adopted backbone in biological imaging. On held-out regions of the training datasets, our models achieved high performance in both classification and detection tasks. On independent reference slides from other datasets, ViTs continued to outperform ResNet50 in classification. However, in full object detection and under real deployment conditions, ResNet50-based models remained competitive and achieved the highest accuracy for detecting Ambrosia, a major allergen of public health significance. Cross-dataset generalization remains a challenge, underscoring the need for domain adaptation techniques such as stain normalization and data augmentation. This study establishes a scalable framework for AI-assisted pollen monitoring, supporting large-scale slide digitization and enabling applications in long-term ecological research, allergen surveillance, and automated biodiversity assessment.
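To make the modelling pipeline concrete, the two sketches below illustrate the backbone-selection and detection steps summarized above. They are minimal illustrations written against standard PyTorch/torchvision and scikit-learn APIs, assuming ImageNet-pretrained weights, an illustrative 224 × 224 crop size, and placeholder names such as NUM_TAXA; they are not the released implementation (see Code availability).

First, a linear-probing sketch: a pretrained ViT is frozen and a linear classifier is fitted on its features to gauge how well they separate the pollen taxa.

```python
# Linear-probing sketch (illustrative, not the exact protocol of this study):
# freeze a pretrained ViT, extract features for cropped pollen grains, and fit
# a linear classifier on top to gauge how well the frozen features separate the taxa.
import torch
import torchvision
from sklearn.linear_model import LogisticRegression

backbone = torchvision.models.vit_b_16(weights="IMAGENET1K_V1")  # any pretrained ViT variant
backbone.heads = torch.nn.Identity()  # drop the ImageNet head, keep the 768-d class-token features
backbone.eval()

def extract_features(crops):
    # crops: (N, 3, 224, 224) tensor of normalized pollen-grain crops
    with torch.no_grad():
        return backbone(crops).numpy()

# X_crops, y_taxa would come from the generated annotations (placeholder names):
# probe = LogisticRegression(max_iter=1000).fit(extract_features(X_crops), y_taxa)
```

Second, a Faster R-CNN detector with the ResNet50-FPN baseline backbone, with the box predictor replaced to output the seven pollen taxa plus background. Integrating a plain ViT backbone instead requires a custom feature-pyramid adapter (as in ViTDet) and is omitted here.

```python
# Faster R-CNN sketch with the ResNet50-FPN baseline backbone (torchvision),
# reconfigured for the seven pollen taxa plus the implicit background class.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_TAXA = 7  # pollen taxa in this study; index 0 is reserved for background

detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = detector.roi_heads.box_predictor.cls_score.in_features
detector.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_TAXA + 1)

detector.eval()
with torch.no_grad():
    # a dummy tile stands in for a digitized bright-field slide region
    predictions = detector([torch.rand(3, 1024, 1024)])
print(predictions[0]["boxes"].shape, predictions[0]["labels"].shape)
```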
Data availability
All datasets used in this study, including digitized slides and corresponding hand annotations, are available upon request. For additional information or specific requests, please contact the corresponding author.
Code availability
The source code utilized in this study is available from our GitHub repository at https://github.com/abiricz/pollen-auto-annot-init-paper. This repository includes all scripts and comprehensive documentation required to replicate the experiments and evaluations described in this work.
Acknowledgements
This work was primarily supported by Information in Images Ltd. Special thanks to Michael Broderick, the director of the company, whose support was instrumental in restarting this research. We also acknowledge Zsolt Bedőházi for his contributions to the initial software development and preliminary prototyping. We are grateful to the teams at the National Public Health Center, the Swedish Museum of Natural History, and the Réseau National de Surveillance Aérobiologique in Lyon for their efforts in preparing the data and providing reference samples. A special acknowledgment is extended to János Fillinger and his team for providing access to their facility for scanning the samples. Their expertise in pathology brought a valuable external perspective beyond the field of pollen monitoring, further enriching this study. We thank Viktor Varga for his valuable input in the final refinement of the manuscript, including suggestions for minor corrections and additional evaluations that improved the clarity and completeness of the work. The authors thank the Wigner Scientific Computing Laboratory (WSCLAB) for providing computational resources that enabled large-scale evaluations and experiments for this publication. All code development was conducted independently prior to these computations, ensuring the integrity of proprietary research and potential industrial applications.
Funding
This work was further supported by the National Research, Development, and Innovation Office of Hungary within the framework of the MILAB Artificial Intelligence National Laboratory (RRF-2.3.1-21-2022-00004) (I.C.), the Data-Driven Health Division of the National Laboratory for Health Security (RRF-2.3.1-21-2022-00006) (P.P.), and grant No. 2020-1.1.2-PIACI-KFI-2021-00298 (A.B.). Finally, we sincerely thank Semmelweis University for generously covering the publication fee for this paper.
Author information
Authors and Affiliations
Contributions
All authors read and approved the final version of the manuscript. András Biricz: conceptualization, data curation, formal analysis, investigation, methodology, project administration, software, validation, writing – original draft. Donát Magyar: resources, project administration, validation, writing – review & editing. Björn Gedda: resources, project administration, validation, writing – review & editing. Antonio Spanu: resources, validation, writing – review & editing. János Fillinger: data curation, resources, project administration, validation. Adrián Pesti: data curation, resources, validation. István Csabai: conceptualization, funding acquisition, project administration, supervision, writing – review & editing. Péter Pollner: conceptualization, funding acquisition, project administration, supervision, writing – review & editing.
Corresponding authors
Ethics declarations
Competing interests
András Biricz reports contractual work with Information in Images Ltd., directed by Michael Broderick, which supported this study and is engaged in the commercial sale of microscopy devices. The company may potentially benefit from findings related to digital microscopy and dataset generation. All other authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Biricz, A., Magyar, D., Gedda, B. et al. Efficient and scalable training set generation for automated pollen monitoring with Hirst-type samplers. Sci. Rep. (2025). https://doi.org/10.1038/s41598-025-31646-2
Keywords
- Airborne allergen analysis
- Automated pollen detection
- Deep learning
- Hirst-type sampler
- Open-vocabulary object detection
- Pollen monitoring
- Vision Transformer
