Chromosome-level genome assembly of Utricularia aurea Lour., a carnivorous higher plant with minute genome


Abstract

Utricularia aurea Lour. is an aquatic herbaceous plant with yellow flowers and a unique insect-trapping mechanism, distributed in several regions of China and other Asian countries. In this study, we assembled a chromosome-scale genome of U. aurea with a size of approximately 180.31 Mb and a contig N50 length of 7.2 Mb. Using Oxford Nanopore long reads and Hi-C sequencing data, we anchored 99.99% of the assembled sequences onto 20 pseudo-chromosomes. We predicted a total of 33,365 protein-coding genes, of which 97.33% were functionally annotated using public databases including NR, GO, KOG, KEGG, and Swiss-Prot. Additionally, we identified 117 rRNAs, 509 sRNAs and 382 tRNAs in the genome. The chromosome-scale U. aurea genome will facilitate investigations into the genomic basis of its carnivorous adaptation, improve our understanding of the evolutionary history of carnivorous habits in Utricularia, and enrich research on the adaptive evolution of carnivorous plant genomes.
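For readers unfamiliar with the contig N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions, not values or code from this study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (illustrative helper)."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating length until
    # at least half of the total assembly size is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example with made-up lengths (not data from the U. aurea assembly).
toy_contigs = [8, 7, 5, 3, 2]
toy_n50 = n50(toy_contigs)
```

A larger N50 indicates that more of the assembly sits in long contigs, which is why it is reported alongside the total assembly size.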

Data availability

The genome assembly of Utricularia aurea is available in GenBank under accession number JBPYBC000000000 (https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_056322005.1/), linked to BioProject PRJNA1289348 (https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA1289348). The raw reads are available in the SRA database under accession numbers SRR34843000 [33], SRR34844152 [34], SRR34844153 [35] and SRR34844667–SRR34844670 [36–39]. All raw sequencing data and assembled sequences are publicly accessible. The assembled genome and annotation results have been deposited in the figshare database (https://doi.org/10.6084/m9.figshare.29833016) [41].
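As a convenience for anyone scripting against these records, the sketch below expands the SRA run accessions given in references 33–39 into a flat list and builds the identifiers.org URLs in the form used in the reference list. The helper names `expand_sra_range` and `identifiers_url` are hypothetical, and the code only constructs strings; it does not contact NCBI or download any data.

```python
def expand_sra_range(first, last):
    """Expand an inclusive range of SRR accessions (hypothetical helper)."""
    prefix = "SRR"
    if not (first.startswith(prefix) and last.startswith(prefix)):
        raise ValueError("both accessions must start with 'SRR'")
    width = len(first) - len(prefix)
    start, end = int(first[len(prefix):]), int(last[len(prefix):])
    if end < start:
        raise ValueError("range end precedes range start")
    # Keep the numeric part zero-padded to its original width.
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


def identifiers_url(accession):
    """Build the identifiers.org URL pattern seen in references 33-39."""
    return f"https://identifiers.org/ncbi/insdc.sra:{accession}"


# Accessions as listed in references 33-39.
runs = ["SRR34843000", "SRR34844152", "SRR34844153"]
runs += expand_sra_range("SRR34844667", "SRR34844670")
urls = [identifiers_url(acc) for acc in runs]
```

The resulting list holds seven run accessions, matching the seven SRA entries in the reference list.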

Code availability

All data analyses were performed using standard, freely available bioinformatic tools; parameters and software versions are detailed in the Methods section and Supplementary Table S1. All software and code used in this study are publicly accessible.

References

  1. Taylor, P. G. The Genus Utricularia: A Taxonomic Monograph. First edn, Vol. 43 (Royal Botanic Gardens, Kew, 1989).

  2. Król, E. et al. Quite a few reasons for calling carnivores “the most wonderful plants in the world”. Annals of Botany 109, 47–64, https://doi.org/10.1093/aob/mcr249 (2011).

  3. Poppinga, S., Weisskopf, C., Westermeier, A., Masselter, T. & Speck, T. Fastest predators in plant kingdom: Functional morphology and biomechanics of suction traps found in the largest genus of carnivorous plants. AoB Plants 8, plv140, https://doi.org/10.1093/aobpla/plv140 (2015).

  4. Rutishauser, R. & Isler, B. Developmental Genetics and Morphological Evolution of Flowering Plants, Especially Bladderworts (Utricularia): Fuzzy Arberian morphology complements classical morphology. Annals of Botany 88, 1173–1202, https://doi.org/10.1006/anbo.2001.1498 (2001).

  5. Ibarra-Laclette, E. et al. Architecture and evolution of a minute plant genome. Nature 498, 94–98, https://doi.org/10.1038/nature12132 (2013).

  6. Zedek, F. et al. The smallest angiosperm genomes may be the price for effective traps of bladderworts. Annals of Botany 134, 1131–1138, https://doi.org/10.1093/aob/mcae107 (2024).

  7. Lan, T. et al. Long-read sequencing uncovers the adaptive topography of a carnivorous plant genome. Proceedings of the National Academy of Sciences 114, E4435–E4441, https://doi.org/10.1073/pnas.1702072114 (2017).

  8. Silva, S. et al. The Terrestrial Carnivorous Plant Utricularia reniformis Sheds Light on Environmental and Life-Form Genome Plasticity. International Journal of Molecular Sciences 21, e3, https://doi.org/10.3390/ijms21010003 (2020).

  9. Liu, B. et al. Estimation of genomic characteristics by analyzing k-mer frequency in de novo genome projects. Quantitative Biology 35, 62–67, https://doi.org/10.48550/arXiv.1308.2012 (2013).

  10. Ranallo-Benavidez, T. R., Jaron, K. S. & Schatz, M. C. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nature Communications 11, 1432, https://doi.org/10.1038/s41467-020-14998-3 (2020).

  11. Wick, R. R., Judd, L. M. & Holt, K. E. Performance of neural network basecalling tools for Oxford Nanopore sequencing. Genome Biology 20, 129, https://doi.org/10.1101/543439 (2019).

  12. Cali, D. S., Kim, J. S., Ghose, S., Alkan, C. & Mutlu, O. Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions. Brief Bioinform 20, 1542–1559, https://doi.org/10.1093/BIB/BBY017 (2019).

  13. Hu, J., Fan, J. P., Sun, Z. Y. & Liu, S. L. NextPolish: A fast and efficient genome polishing tool for long-read assembly. Bioinformatics (Oxford, England) 36, 3210–3212, https://doi.org/10.1093/bioinformatics/btz891 (2019).

  14. Chen, S. F., Zhou, Y. Q., Chen, Y. R. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890, https://doi.org/10.1093/bioinformatics/bty560 (2018).

  15. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nature Methods 9, 357–359, https://doi.org/10.1038/nmeth.1923 (2012).

  16. Servant, N. et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biology 16, 259, https://doi.org/10.1186/s13059-015-0831-x (2015).

  17. Burton, J. N. et al. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nature Biotechnology 31, 1119–1125, https://doi.org/10.1038/nbt.2727 (2013).

  18. Wang, X. W. & Wang, L. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing. Frontiers in Plant Science 7, 1350, https://doi.org/10.3389/fpls.2016.01350 (2016).

  19. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Research 27, 573–580, https://doi.org/10.1093/nar/27.2.573 (1999).

  20. Han, Y. & Wessler, S. R. MITE-Hunter: a program for discovering miniature inverted-repeat transposable elements from genomic sequences. Nucleic Acids Res 38, e199, https://doi.org/10.1093/nar/gkq862 (2010).

  21. Bedell, J. A., Korf, I. & Gish, W. MaskerAid: a performance enhancement to RepeatMasker. Bioinformatics 16, 1040–1041, https://doi.org/10.1093/bioinformatics/16.11.1040 (2000).

  22. Keilwagen, J. et al. Using intron position conservation for homology-based gene prediction. Nucleic Acids Research 44, e89–e89, https://doi.org/10.1093/nar/gkw092 (2016).

  23. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21, https://doi.org/10.1093/bioinformatics/bts635 (2013).

  24. Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biology 9, R7, https://doi.org/10.1186/gb-2008-9-1-r7 (2008).

  25. Pertea, M., Kim, D., Pertea, G. M., Leek, J. T. & Salzberg, S. L. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nature Protocols 11, 1650–1667, https://doi.org/10.1038/nprot.2016.095 (2016).

  26. Stanke, M., Diekhans, M., Baertsch, R. & Haussler, D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 24, 637–644, https://doi.org/10.1093/bioinformatics/btn013 (2008).

  27. Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25, 955–964, https://doi.org/10.1093/nar/25.5.955 (1997).

  28. Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935, https://doi.org/10.1093/bioinformatics/btt509 (2013).

  29. Griffiths-Jones, S. et al. Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res 33, D121–124, https://doi.org/10.1093/nar/gki081 (2005).

  30. Lagesen, K. et al. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 35, 3100–3108, https://doi.org/10.1093/nar/gkm160 (2007).

  31. Zdobnov, E. M. & Apweiler, R. InterProScan–an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17, 847–848, https://doi.org/10.1093/bioinformatics/17.9.847 (2001).

  32. McGinnis, S. & Madden, T. L. BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res 32, W20–W25, https://doi.org/10.1093/nar/gkh435 (2004).

  33. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34843000 (2025).

  34. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34844152 (2025).

  35. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34844153 (2025).

  36. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34844667 (2025).

  37. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34844668 (2025).

  38. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34844669 (2025).

  39. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34844670 (2025).

  40. NCBI Assembly https://www.ncbi.nlm.nih.gov/nuccore/JBPYBC000000000.1/ (2025).

  41. Yu, J. & Dong, H. Chromosome-level genome assembly of Utricularia aurea Lour., a canivorious higher plant with minute genome. figshare https://doi.org/10.6084/m9.figshare.29833016 (2025).

  42. Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595, https://doi.org/10.1093/bioinformatics/btp698 (2010).

  43. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079, https://doi.org/10.1093/bioinformatics/btp352 (2009).

  44. Danecek, P. & McCarthy, S. A. BCFtools/csq: haplotype-aware variant consequences. Bioinformatics 33, 2037–2039, https://doi.org/10.1093/bioinformatics/btx100 (2017).

  45. Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067, https://doi.org/10.1093/bioinformatics/btm071 (2007).

  46. Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212, https://doi.org/10.1093/bioinformatics/btv351 (2015).

  47. Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biology 21, 245, https://doi.org/10.1186/s13059-020-02134-9 (2020).

Acknowledgements

This work was supported by the Hubei Provincial Department of Science and Technology Innovation Platform Plan Project (2024CSA071) and the National Science and Technology Fundamental Resources Investigation Program of China (2019FY101809).

Author information

Contributions

Jiaojun Yu, Shisheng Li and Hongjin Dong conceived and designed the study and revised the manuscript. Shisheng Li and Hongjin Dong collected the plant material. Jiaojun Yu performed the experiments, analyzed the data and wrote the manuscript. All authors approved the final manuscript.

Corresponding author

Correspondence to Hongjin Dong.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Tables S1–S5 (XLSX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yu, J., Li, S. & Dong, H. Chromosome-level genome assembly of Utricularia aurea Lour., a canivorious higher plant with minute genome. Sci Data (2026). https://doi.org/10.1038/s41597-026-07285-1

Source: Ecology - nature.com
