Abstract
Utricularia aurea Lour. is an aquatic herbaceous plant with yellow flowers and a unique insect-trapping mechanism, distributed in several regions of China and other Asian countries. In this study, we assembled a chromosome-scale genome of U. aurea, approximately 180.31 Mb in size, with a contig N50 length of 7.2 Mb. Based on Oxford Nanopore long reads and Hi-C sequencing data, 99.99% of the assembled sequences were anchored onto 20 pseudo-chromosomes. We predicted a total of 33,365 protein-coding genes, of which 97.33% were functionally annotated using public databases including NR, GO, KOG, KEGG, and Swiss-Prot. Additionally, we identified 117 rRNAs, 509 sRNAs and 382 tRNAs in the genome. The chromosome-scale U. aurea genome will facilitate investigations into the genomic basis of its carnivorous adaptation, improve our understanding of the evolutionary history of carnivorous habits in Utricularia, and enrich research on the adaptive evolution of carnivorous plant genomes.
Data availability
The genome assembly of Utricularia aurea is available in GenBank under accession number JBPYBC000000000 (https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_056322005.1/) and is linked to BioProject PRJNA1289348 (https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA1289348). The raw sequencing data are deposited in the SRA database under accession numbers SRR34843000 (ref. 33), SRR34844152 (ref. 34), SRR34844153 (ref. 35), and SRR34844667–SRR34844670 (refs. 36–39). All raw sequencing data and assembled sequences are publicly accessible. The assembled genome and annotation results have been deposited in the figshare database (https://doi.org/10.6084/m9.figshare.29833016; ref. 41).
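For readers who want to retrieve the raw reads, the seven SRA run accessions given in the reference list can be enumerated programmatically. The Python sketch below is an illustration only and is not part of the authors' pipeline: the function names are hypothetical, the range is assumed to be contiguous (as the reference list indicates), and the URL pattern is copied from the reference entries.

```python
# Run accessions taken from the reference list (SRA entries).
SINGLE_RUNS = ["SRR34843000", "SRR34844152", "SRR34844153"]
RANGE_START, RANGE_END = "SRR34844667", "SRR34844670"  # inclusive range

PREFIX = "SRR"


def expand_range(start: str, end: str) -> list[str]:
    """Expand an inclusive range of run accessions sharing the SRR prefix."""
    first = int(start[len(PREFIX):])
    last = int(end[len(PREFIX):])
    width = len(start) - len(PREFIX)  # keep the original digit count
    return [f"{PREFIX}{n:0{width}d}" for n in range(first, last + 1)]


def sra_runs() -> list[str]:
    """Return all run accessions: the three single runs plus the range."""
    return SINGLE_RUNS + expand_range(RANGE_START, RANGE_END)


def resolver_url(run: str) -> str:
    """Build the identifiers.org URL in the form used by the reference list."""
    return f"https://identifiers.org/ncbi/insdc.sra:{run}"


if __name__ == "__main__":
    for run in sra_runs():
        print(resolver_url(run))
```

Running the script prints one resolver URL per run, which can then be opened in a browser or passed to a download tool of the reader's choice.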
Code availability
All data analyses were performed using standard bioinformatic tools that are freely available to the public; parameters and software versions are detailed in the Methods section and Supplementary Table S1. All software and code utilized in this study are publicly accessible.
References
Taylor, P. G. The Genus Utricularia: A Taxonomic Monograph. First edn, Vol. 43 (Royal Botanic Gardens, Kew, 1989).
Król, E. et al. Quite a few reasons for calling carnivores “the most wonderful plants in the world”. Annals of Botany 109, 47–64, https://doi.org/10.1093/aob/mcr249 (2011).
Poppinga, S., Weisskopf, C., Westermeier, A., Masselter, T. & Speck, T. Fastest predators in plant kingdom: Functional morphology and biomechanics of suction traps found in the largest genus of carnivorous plants. AoB Plants 8, plv140, https://doi.org/10.1093/aobpla/plv140 (2015).
Rutishauser, R. & Isler, B. Developmental Genetics and Morphological Evolution of Flowering Plants, Especially Bladderworts (Utricularia): Fuzzy Arberian morphology complements classical morphology. Annals of Botany 88, 1173–1202, https://doi.org/10.1006/anbo.2001.1498 (2001).
Ibarra-Laclette, E. et al. Architecture and evolution of a minute plant genome. Nature 498, 94–98, https://doi.org/10.1038/nature12132 (2013).
Zedek, F. et al. The smallest angiosperm genomes may be the price for effective traps of bladderworts. Annals of Botany 134, 1131–1138, https://doi.org/10.1093/aob/mcae107 (2024).
Lan, T. et al. Long-read sequencing uncovers the adaptive topography of a carnivorous plant genome. Proceedings of the National Academy of Sciences 114, E4435–E4441, https://doi.org/10.1073/pnas.1702072114 (2017).
Silva, S. et al. The Terrestrial Carnivorous Plant Utricularia reniformis Sheds Light on Environmental and Life-Form Genome Plasticity. International Journal of Molecular Sciences 21, e3, https://doi.org/10.3390/ijms21010003 (2020).
Liu, B. et al. Estimation of genomic characteristics by analyzing k-mer frequency in de novo genome projects. Quantitative Biology 35, 62–67, https://doi.org/10.48550/arXiv.1308.2012 (2013).
Ranallo-Benavidez, T. R., Jaron, K. S. & Schatz, M. C. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nature communications 11, 1432, https://doi.org/10.1038/s41467-020-14998-3 (2020).
Wick, R. R., Judd, L. M. & Holt, K. E. Performance of neural network basecalling tools for Oxford Nanopore sequencing. Genome biology 20, 129, https://doi.org/10.1101/543439 (2019).
Cali, D. S., Kim, J. S., Ghose, S., Alkan, C. & Mutlu, O. Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions. Brief Bioinform 20, 1542–1559, https://doi.org/10.1093/BIB/BBY017 (2019).
Hu, J., Fan, J. P., Sun, Z. Y. & Liu, S. L. NextPolish: A fast and efficient genome polishing tool for long-read assembly. Bioinformatics 36, 2253–2255, https://doi.org/10.1093/bioinformatics/btz891 (2020).
Chen, S. F., Zhou, Y. Q., Chen, Y. R. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890, https://doi.org/10.1093/bioinformatics/bty560 (2018).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nature Methods 9, 357–359, https://doi.org/10.1038/nmeth.1923 (2012).
Servant, N. et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome biology 16, 259, https://doi.org/10.1186/s13059-015-0831-x (2015).
Burton, J. N. et al. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nature Biotechnology 31, 1119–1125, https://doi.org/10.1038/nbt.2727 (2013).
Wang, X. W. & Wang, L. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing. Frontiers in Plant Science 7, 1350, https://doi.org/10.3389/fpls.2016.01350 (2016).
Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Research 27, 573–580, https://doi.org/10.1093/nar/27.2.573 (1999).
Han, Y. & Wessler, S. R. MITE-Hunter: a program for discovering miniature inverted-repeat transposable elements from genomic sequences. Nucleic Acids Res 38, e199, https://doi.org/10.1093/nar/gkq862 (2010).
Bedell, J. A., Korf, I. & Gish, W. MaskerAid: a performance enhancement to RepeatMasker. Bioinformatics 16, 1040–1041, https://doi.org/10.1093/bioinformatics/16.11.1040 (2000).
Keilwagen, J. et al. Using intron position conservation for homology-based gene prediction. Nucleic Acids Research 44, e89–e89, https://doi.org/10.1093/nar/gkw092 (2016).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21, https://doi.org/10.1093/bioinformatics/bts635 (2013).
Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome biology 9, R7, https://doi.org/10.1186/gb-2008-9-1-r7 (2008).
Pertea, M., Kim, D., Pertea, G. M., Leek, J. T. & Salzberg, S. L. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nature Protocols 11, 1650–1667, https://doi.org/10.1038/nprot.2016.095 (2016).
Stanke, M., Diekhans, M., Baertsch, R. & Haussler, D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 24, 637–644, https://doi.org/10.1093/bioinformatics/btn013 (2008).
Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25, 955–964, https://doi.org/10.1093/nar/25.5.955 (1997).
Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935, https://doi.org/10.1093/bioinformatics/btt509 (2013).
Griffiths-Jones, S. et al. Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res 33, D121–124, https://doi.org/10.1093/nar/gki081 (2005).
Lagesen, K. et al. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 35, 3100–3108, https://doi.org/10.1093/nar/gkm160 (2007).
Zdobnov, E. M. & Apweiler, R. InterProScan–an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17, 847–848, https://doi.org/10.1093/bioinformatics/17.9.847 (2001).
McGinnis, S. & Madden, T. L. BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res 32, W20–W25, https://doi.org/10.1093/nar/gkh435 (2004).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34843000 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34844152 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34844153 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34844667 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34844668 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34844669 (2025).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR34844670 (2025).
NCBI Assembly https://www.ncbi.nlm.nih.gov/nuccore/JBPYBC000000000.1/ (2025).
Yu, J. & Dong, H. Chromosome-level genome assembly of Utricularia aurea Lour., a canivorious higher plant with minute genome. figshare https://doi.org/10.6084/m9.figshare.29833016 (2025).
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595, https://doi.org/10.1093/bioinformatics/btp698 (2010).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079, https://doi.org/10.1093/bioinformatics/btp352 (2009).
Danecek, P. & McCarthy, S. A. BCFtools/csq: haplotype-aware variant consequences. Bioinformatics 33, 2037–2039, https://doi.org/10.1093/bioinformatics/btx100 (2017).
Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067, https://doi.org/10.1093/bioinformatics/btm071 (2007).
Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212, https://doi.org/10.1093/bioinformatics/btv351 (2015).
Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome biology 21, 245, https://doi.org/10.1186/s13059-020-02134-9 (2020).
Acknowledgements
This work was supported by the Hubei Provincial Department of Science and Technology Innovation Platform Plan Project (2024CSA071) and the National Science and Technology Fundamental Resources Investigation Program of China (2019FY101809).
Author information
Contributions
Jiaojun Yu, Shisheng Li, and Hongjin Dong conceived and designed the study and revised the manuscript. Shisheng Li and Hongjin Dong collected the plant material. Jiaojun Yu performed the experiments, analyzed the data, and wrote the manuscript. All authors approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Tables S1–S5
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Yu, J., Li, S. & Dong, H. Chromosome-level genome assembly of Utricularia aurea Lour., a canivorious higher plant with minute genome. Sci Data (2026). https://doi.org/10.1038/s41597-026-07285-1
