De novo assembly of complete circular mitochondrial genomes from 2,695 fungal species


Abstract

Fungal mitochondrial genomes are critical for understanding the phylogenetics, evolution, and ecology of the Kingdom Fungi, yet they remain underrepresented in public databases. To address this, we developed a workflow to recover mitochondrial genomes from 12,902 fungal short-read sequencing records housed in the Sequence Read Archive (SRA), assembling complete circular genomes from 2,695 species. This effort expanded fungal mitochondrial genome diversity by nearly 2.3-fold, particularly in understudied phyla such as Mucoromycota (11-fold increase) and Zoopagomycota (8-fold increase). The new dataset contains previously undescribed mitochondrial genomes across multiple taxonomic ranks, including 15 classes, 64 orders, 178 families, and 544 genera. Taxonomic analysis revealed broad ecological representation among the top-assembled species, including human pathogens (e.g., Cryptococcus tetragattii), plant pathogens (e.g., Melampsora larici-populina), edible mushrooms (e.g., Suillus luteus), and industrial fungi. By leveraging underexploited SRA sequencing data, this study fills critical gaps in fungal mitochondrial genomics, tripling the currently known mitochondrial genome diversity of the Kingdom Fungi, and provides an extensive resource for phylogenetic and evolutionary research.

Code availability

The assembly workflow was implemented in a Python script (assembly_workflow.py) that takes an SRA run accession as input and outputs the assembly contigs and graphs, which are then used by GetOrganelle for mitochondrial genome extraction (Methods). The script relies on previously published tools, which are described in the Methods section. The script is available on GitHub at https://github.com/msabrysarhan/fungal_mtDNA.
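The flow described above (run accession in, assembly graph out, mitochondrial extraction from the graph) can be sketched as follows. This is an illustrative outline only and not the authors' assembly_workflow.py: the function name `build_commands`, the directory layout, the paired-end assumption, the choice of fasterq-dump, fastp and metaSPAdes for the download, trimming and assembly steps, and the placeholder accession are all assumptions made for the example. The sketch only builds the command lines and does not execute them.

```python
from pathlib import Path
from typing import List


def build_commands(run_accession: str, workdir: str = "work") -> List[List[str]]:
    """Build, in order, the command lines for one paired-end SRA run.

    Hypothetical helper: tool choice and file layout are assumptions,
    not taken from the published script.
    """
    out = Path(workdir) / run_accession
    raw1 = out / f"{run_accession}_1.fastq"
    raw2 = out / f"{run_accession}_2.fastq"
    clean1 = out / "clean_1.fastq"
    clean2 = out / "clean_2.fastq"
    assembly = out / "assembly"
    graph = assembly / "assembly_graph.fastg"  # graph file written by SPAdes

    return [
        # 1. Download the run and split it into forward/reverse reads.
        ["fasterq-dump", "--split-files", "-O", str(out), run_accession],
        # 2. Quality-trim and filter the read pairs.
        ["fastp", "-i", str(raw1), "-I", str(raw2),
         "-o", str(clean1), "-O", str(clean2)],
        # 3. Assemble the cleaned reads in metagenomic mode.
        ["spades.py", "--meta", "-1", str(clean1), "-2", str(clean2),
         "-o", str(assembly)],
        # 4. Extract the mitochondrial genome from the assembly graph
        #    using GetOrganelle's fungal mitochondrial database.
        ["get_organelle_from_assembly.py", "-g", str(graph),
         "-F", "fungus_mt", "-o", str(out / "mito")],
    ]


if __name__ == "__main__":
    # "SRR0000000" is a placeholder accession, not a run from the study.
    for command in build_commands("SRR0000000"):
        print(" ".join(command))
```

Each returned list could be passed to `subprocess.run(command, check=True)` to stop the pipeline at the first failing step; keeping command construction separate from execution makes the plan easy to inspect before any data is downloaded.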

Data availability

The nucleotide sequence data reported here are available in the Third Party Annotation Section of the DDBJ/ENA/GenBank databases under BioProject PRJNA1367877 and accession numbers TPA: BK072095-BK074789. The metadata are available at https://doi.org/10.6084/m9.figshare.28750034.
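As a quick consistency check, the accession range above can be expanded into individual accession numbers. The helper below is a minimal sketch (the function name `expand_accession_range` is hypothetical) that assumes the accessions in the range are contiguous and share the same letter prefix and digit width; under that assumption the range covers exactly as many records as there are species in the title.

```python
from typing import List


def expand_accession_range(first: str, last: str) -> List[str]:
    """Expand an inclusive accession range such as BK072095-BK074789.

    Assumes both accessions share one letter prefix followed by a
    zero-padded number of the same width.
    """
    prefix = first.rstrip("0123456789")
    digits_first = first[len(prefix):]
    digits_last = last[len(prefix):]
    if not last.startswith(prefix) or not digits_last.isdigit():
        raise ValueError("accessions do not share the same prefix")
    if len(digits_first) != len(digits_last):
        raise ValueError("accessions do not have the same digit width")

    width = len(digits_first)
    start, end = int(digits_first), int(digits_last)
    return [f"{prefix}{number:0{width}d}" for number in range(start, end + 1)]


accessions = expand_accession_range("BK072095", "BK074789")
print(len(accessions))   # 2695
print(accessions[0])     # BK072095
print(accessions[-1])    # BK074789
```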

Acknowledgements

This work was supported by the “MOC – MultiOmics Centre for Food and Health” project. The MOC project is co-funded by the European Union (European Regional Development Fund – EFRE). Ammar Abdalrahem was supported by a PhD fellowship from the French Ministry of Education and Research (MESR) and by the French Plan Investissement d’Avenir (PIA) Lab of Excellence ARBRE [ANR-11-LABX-0002-01]. The authors thank the Department of Innovation, Research and University of the Autonomous Province of Bozen/Bolzano, Italy, for covering the Open Access publication costs.

Author information

Contributions

M.S.S. conceived the original idea. M.S.S. and A.A. designed and performed the computational analysis. M.S.S. performed the data visualization and wrote the first draft of the manuscript. M.S.S. and A.A. curated the data for public deposition. F.M. and C.F. edited and revised the manuscript. All authors read and approved the final version of the manuscript.

Corresponding author

Correspondence to Mohamed S. Sarhan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Sarhan, M.S., Abdalrahem, A., Maixner, F. et al. De novo assembly of complete circular mitochondrial genomes from 2,695 fungal species. Sci Data (2025). https://doi.org/10.1038/s41597-025-06447-x

  • DOI: https://doi.org/10.1038/s41597-025-06447-x


Source: Ecology - nature.com
