Abstract
Haloxylon ammodendron and Haloxylon persicum are ecologically and economically important drought-tolerant plants, often referred to as “desert guardians” owing to their remarkable ability to withstand drought and salinity. To gain a deeper understanding of their adaptive mechanisms and resource potential, we generated chromosome-level genome assemblies using PacBio HiFi long reads and Hi-C technology. The final assembled genome sizes were 2.32 Gb (contig N50 = 5.11 Mb; scaffold N50 = 257.59 Mb) for H. ammodendron and 1.32 Gb (contig N50 = 9.55 Mb; scaffold N50 = 143.67 Mb) for H. persicum, with 97.84% and 95.45% of the respective sequences anchored to nine pseudochromosomes. The BUSCO completeness scores were 88.40% and 84.00% for H. ammodendron and H. persicum, respectively. Gene annotation identified 69,844 protein-coding genes in H. ammodendron and 66,859 in H. persicum, with repeat elements constituting 57.42% and 52.88% of their genomes, respectively. These Haloxylon reference genomes provide valuable resources for exploring the ecological and economic significance of desert plants.
Data availability
All raw sequencing data can be obtained from the NCBI genome database, and the genome annotation files can be accessed from the Figshare database.
Code availability
All software used in this study was run according to the manuals and protocols provided by the software developers. The software versions are listed in the Methods. No custom code was used in this study.
References
Li, J. Y., Chang, H., Liu, T. & Zhang, C. The potential geographical distribution of Haloxylon across Central Asia under climate change in the 21st century. Agric For Meteorol 275, 243–254, https://doi.org/10.1016/j.agrformet.2019.05.027 (2019).
Ehsan, A. B. D. I., Hamid, R. S., Baris, M. & Azade, D. Soil fixation and erosion control by Haloxylon persicum roots in arid lands, Iran. J Arid Land 11(1), 86–96, https://doi.org/10.1007/s40333-018-0021-2 (2019).
Shi, S. et al. Sandstorms damage the photosynthetic activities of Haloxylon ammodendron seedlings. Acta Physiol Plant 45, 54, https://doi.org/10.1007/s11738-023-03528-5 (2023).
Yang, Y. & Lv, G. H. Characterization of the gene expression profile response to drought stress in Haloxylon using PacBio single-molecule real-time and Illumina sequencing. Front Plant Sci 13, 1–18, https://doi.org/10.3389/fpls.2022.981029 (2022).
Thevs, N., Wucherer, W. & Buras, A. Spatial distribution and carbon stock of the Saxaul vegetation of the winter-cold deserts of Middle Asia. J Arid Environ 90, 29–35, https://doi.org/10.1016/j.jaridenv.2012.10.013 (2013).
Zhang, C. et al. The spatiotemporal patterns of vegetation coverage and biomass of the temperate deserts in Central Asia and their relationships with climate controls. Remote Sens Environ 175, 271–281, https://doi.org/10.1016/j.rse.2016.01.002 (2016).
Jiang, Y. & Tu, P. F. Analysis of chemical constituents in Cistanche species. J Chromatogr A 1216, 1970–1979, https://doi.org/10.1016/j.chroma.2008.07.031 (2009).
Lü, X. P. et al. Dynamic responses of Haloxylon ammodendron to various degrees of simulated drought stress. Plant Physiol Biochem 139, 121–131, https://doi.org/10.1016/j.plaphy.2019.03.019 (2019).
Fan, L. et al. Transcriptomic view of survival during early seedling growth of the extremeophyte Haloxylon ammodendron. Plant Physiol Biochem 132, 475–489, https://doi.org/10.1016/j.plaphy.2018.09.024 (2018).
Li, C. J. et al. Morphological and physiological responses of desert plants to drought stress in a man-made landscape of the Taklimakan desert shelter belt. Ecol Indic 140, 109037, https://doi.org/10.1016/j.ecolind.2022.109037 (2022).
Yang, Y. & Lv, G. H. Combined analysis of transcriptome and metabolome reveals the molecular mechanism and candidate genes of Haloxylon drought tolerance. Front Plant Sci 13, 1–24, https://doi.org/10.3389/fpls.2022.1020367 (2022).
Yang, L. et al. Insights into the multi-chromosomal mitochondrial genome structure of the xero-halophytic plant Haloxylon ammodendron (C.A.Mey.) Bunge ex Fenzl. BMC Genomics 25, 123, https://doi.org/10.1186/s12864-024-10026-6 (2024).
Porebski, S., Bailey, L. G. & Baum, B. R. Modification of a CTAB DNA extraction protocol for plants containing high polysaccharide and polyphenol components. Plant Mol Biol Rep 15, 8–15, https://doi.org/10.1007/BF02772108 (1997).
Chen, S. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. iMeta 2, e107, https://doi.org/10.1002/imt2.107 (2023).
Marcais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27(6), 764–770, https://doi.org/10.1093/bioinformatics/btr011 (2011).
Ranallo-Benavidez, T. R., Jaron, K. S. & Schatz, M. C. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat Commun 11, 1432, https://doi.org/10.1038/s41467-020-14998-3 (2020).
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods 18, 170–175, https://doi.org/10.1038/s41592-020-01056-5 (2021).
Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst 3(1), 95–98, https://doi.org/10.1016/j.cels.2016.07.002 (2016).
Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst 3(1), 99–101, https://doi.org/10.1016/j.cels.2015.07.012 (2016).
Gu, Z., Gu, L., Eils, R., Schlesner, M. & Brors, B. Circlize implements and enhances circular visualization in R. Bioinformatics 30, 2811–2812, https://doi.org/10.1093/bioinformatics/btu393 (2014).
Ou, S., Chen, J. & Jiang, N. Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res 46, e126, https://doi.org/10.1093/nar/gky730 (2018).
Chen, J. et al. A complete telomere-to-telomere assembly of the maize genome. Nat Genet 55, 1221–1231, https://doi.org/10.1038/s41588-023-01419-6 (2023).
Bedell, J. A., Korf, I. & Gish, W. MaskerAid: a performance enhancement to RepeatMasker. Bioinformatics 16, 1040–1041, https://doi.org/10.1093/bioinformatics/16.11.1040 (2000).
Holt, C. & Yandell, M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12, 491, https://doi.org/10.1186/1471-2105-12-491 (2011).
Brůna, T., Lomsadze, A. & Borodovsky, M. GeneMark-EP+: Eukaryotic gene prediction with self-training in the space of genes and proteins. NAR Genomics Bioinforma 2, 1–14, https://doi.org/10.1093/nargab/lqaa026 (2020).
Stanke, M., Steinkamp, R., Waack, S. & Morgenstern, B. AUGUSTUS: a web server for gene finding in eukaryotes. Nucleic Acids Res 32, 309–312, https://doi.org/10.1093/nar/gkh379 (2004).
Kim, D. et al. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37, 907–915, https://doi.org/10.1038/s41587-019-0201-4 (2019).
Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol 9, 1–22, https://doi.org/10.1186/gb-2008-9-1-r7 (2008).
Boratyn, G. M. et al. BLAST: a more efficient report with usability improvements. Nucleic Acids Res 41, W29–W33, https://doi.org/10.1093/nar/gkt282 (2013).
Zdobnov, E. M. & Apweiler, R. InterProScan–an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17, 847–848, https://doi.org/10.1093/bioinformatics/17.9.847 (2001).
Kanehisa, M. & Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res 28, 27–30, https://doi.org/10.1093/nar/28.1.27 (2000).
Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25, 25–29, https://doi.org/10.1038/75556 (2000).
Koonin, E. V. et al. A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes. Genome Biol 5, 1–28, https://doi.org/10.1186/gb-2004-5-2-r7 (2004).
Mistry, J. et al. Pfam: The protein families database in 2021. Nucleic Acids Res 49, D412–D419, https://doi.org/10.1093/nar/gkaa913 (2020).
Boeckmann, B. et al. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res 31, 365–370, https://doi.org/10.1093/nar/gkg095 (2003).
Marchler-Bauer, A. et al. CDD: a conserved domain database for the functional annotation of proteins. Nucleic Acids Res 39, D225–D229, https://doi.org/10.1093/nar/gkq1189 (2010).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX17134273 (2022).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX17134270 (2022).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX17134264 (2022).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX17134268 (2022).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX17134267 (2022).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRX17134266 (2022).
NCBI GenBank https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_053755035.1/ (2025).
NCBI GenBank https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_053755055.1/ (2025).
Yang, F. et al. Chromosomal-level genome assembly of the dominant desert shrub Haloxylon (Amaranthaceae). Figshare https://doi.org/10.6084/m9.figshare.28399244 (2025).
Acknowledgements
We thank the associate editor and anonymous reviewers for their helpful comments. This work was supported by the Central Government Guide Local Special Fund Projects for Science and Technology Development (ZYYD2025ZY04), the National Natural Science Foundation of China (Youth Fund) (32301307), and the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2023D01C186). We thank LetPub (www.letpub.com.cn) for linguistic assistance and pre-submission expert review.
Author information
Contributions
F.Y. conceived the study; F.Y. collected data, analyzed the data, and drafted the text; D.G. collected and analyzed the data; Y.F.W. and X.L.D. collected samples; G.L. revised the manuscript. All of the authors edited the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Yang, F., Gao, D., Wang, Y. et al. Chromosomal-level genome assembly of two dominant desert shrub species in Haloxylon (Amaranthaceae). Sci Data (2025). https://doi.org/10.1038/s41597-025-06514-3
