Abstract
The kuruma shrimp Marsupenaeus japonicus is one of the most economically important shrimp species in the world. Here, we constructed a chromosome-level genome assembly of M. japonicus by combining PacBio long reads, Illumina short reads and Hi-C scaffolding. The assembled genome was 1.64 Gb in size with a scaffold N50 of 40.61 Mb, and 97.83% (1.60 Gb) of the assembled sequences were anchored to 43 chromosomes. The genome contained 62.79% repeat sequences and 21,172 protein-coding genes, of which 83.20% were functionally annotated. The completeness of the assembly is supported by a BUSCO score of 91.0%. Evolutionary analysis indicated that M. japonicus is closely related to Litopenaeus vannamei and Penaeus monodon, having diverged from their common ancestor an estimated 88.33 million years ago. In sum, the chromosome-level genome assembly and annotation provide fundamental resources for genetic breeding and for studies of molecular mechanisms in M. japonicus.
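For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from this study, which reports that no custom code was developed.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    # Walk through the sequences from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembly length is the N50.
        if running >= half_total:
            return length


# Hypothetical toy scaffold lengths, NOT values from the M. japonicus assembly.
toy_lengths = [50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 40: 50 + 40 = 90 covers at least half of 150
```

In practice the lengths would be read from the assembly FASTA or its index, and assembly-statistics tools report the same quantity.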
Data availability
All the raw sequencing data used for genome assembly were deposited in the NCBI Sequence Read Archive (SRA) database under the accession number SRP581623 [51]. The chromosome-level assembly of the M. japonicus genome was deposited in the European Nucleotide Archive (ENA) under the accession number PRJEB102642 [52]. The genome annotation files were deposited in figshare (https://doi.org/10.6084/m9.figshare.28874273.v1) [53].
Code availability
All commands and pipelines used in data processing were executed according to the manual and protocols of the corresponding bioinformatic software. No specific code has been developed for this study.
References
Tsoi, K. H. et al. Verification of the cryptic species Penaeus pulchricaudatus in the commercially important kuruma shrimp P. japonicus (Decapoda, Penaeidae) using molecular taxonomy. Invertebr Syst. 28, 476–490, https://doi.org/10.1071/IS14001 (2014).
Tsoi, K. H., Chan, T. Y. & Chu, K. H. Molecular population structure of the kuruma shrimp Penaeus japonicus species complex in western Pacific. Mar Biol. 150, 1345–1364, https://doi.org/10.1007/s00227-006-0426-x (2007).
Duan, Y. F. et al. Effect of desiccation and resubmersion on the oxidative stress response of the kuruma shrimp Marsupenaeus japonicus. Fish Shellfish Immun. 49, 91–99, https://doi.org/10.1016/j.fsi.2015.12.018 (2016).
Wang, P. P. et al. Air exposure affects physiological responses, innate immunity, apoptosis and DNA methylation of kuruma shrimp, Marsupenaeus japonicus. Front Physiol. 11, 223, https://doi.org/10.3389/fphys.2020.00223 (2020).
Francis, B. et al. Reproductive performance, salinity tolerance, growth and production performance of a cryptic species Penaeus (Marsupenaeus) japonicus. Aquac Res. 52, 5506–5516, https://doi.org/10.1111/are.15424 (2021).
Hudinaga, M. Reproduction, development and rearing of Penaeus japonicus Bate. Jap J Zool. 10, 305–393 (1942).
China Fishery Statistics Yearbook. Ministry of Agriculture and Rural Affairs of the People’s Republic of China. China Agriculture Press, Beijing (2024).
Behringer, D. C. & Duermit-Moreau, E. Crustaceans, one health and the changing ocean. J Invertebr Pathol. 186, 107500, https://doi.org/10.1016/j.jip.2020.107500 (2021).
Satam, H. et al. Next-generation sequencing technology: current trends and advancements. Biology. 12, 997, https://doi.org/10.3390/biology12070997 (2023).
Yuan, J. B. et al. Recent advances in crustacean genomics and their potential application in aquaculture. Rev Aquac. 15, 1501–1521, https://doi.org/10.1111/raq.12791 (2023).
Briones-Fourzan, P. & Hendrickx, M. E. Ecology and diversity of marine Decapod crustaceans. Diversity. 14, 614, https://doi.org/10.3390/d14080614 (2022).
Zhang, X. J. et al. Penaeid shrimp genome provides insights into benthic adaptation and frequent molting. Nat Commun. 10, 356, https://doi.org/10.1038/s41467-018-08197-4 (2019).
Uengwetwanit, T. et al. A chromosome-level assembly of the black tiger shrimp (Penaeus monodon) genome facilitates the identification of growth-associated genes. Mol Ecol Resour. 21, 1620–1640, https://doi.org/10.1111/1755-0998.13357 (2021).
Yuan, J. B. et al. Simple sequence repeats drive genome plasticity and promote adaptive evolution in penaeid shrimp. Commun Biol. 4, 186, https://doi.org/10.1038/s42003-021-01716-y (2021).
Qi, H. G. et al. Construction and analysis of the chromosome-level haplotype-resolved genomes of two Crassostrea oyster congeners: Crassostrea angulata and Crassostrea gigas. Gigascience. 12, giad077, https://doi.org/10.1093/gigascience/giad077 (2023).
Kawato, S. et al. Genome and transcriptome assemblies of the kuruma shrimp, Marsupenaeus japonicus. G3. 11, jkab268, https://doi.org/10.1093/g3journal/jkab268 (2021).
Ren, X. Y. et al. A chromosome-level genome of the kuruma shrimp (Marsupenaeus japonicus) provides insights into its evolution and cold-resistance mechanism. Genomics. 114, 110373, https://doi.org/10.1016/j.ygeno.2022.110373 (2022).
Hayashi, K. I. & Fujiwara, Y. A new method for obtaining metaphase chromosomes from the regeneration blastema of Penaeus (Marsupenaeus) japonicus. Nippon Suisan Gakk. 54, 1563–1565, https://doi.org/10.2331/suisan.54.1563 (1988).
Xiang, J. H., Liu, R. Y. & Zhou, L. H. Chromosomes of marine shrimps with special reference to different techniques. Aquaculture. 111, 321 (1993).
Zhang, X. J. et al. Penaeid shrimp chromosome studies entering the post-genomic era. Genes. 14, 2050, https://doi.org/10.3390/genes14112050 (2023).
Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 159, 1665–1680, https://doi.org/10.1016/j.cell.2014.11.021 (2014).
Liu, B. et al. Estimation of genomic characteristics by analyzing k-mer frequency in de novo genome projects. Quantitative Biology. 35, 62–67, https://doi.org/10.48550/arXiv.1308.2012 (2013).
Li, R. et al. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 20, 265–272, https://doi.org/10.1101/gr.097261.109 (2010).
Ruan, J. & Li, H. Fast and accurate long-read assembly with wtdbg2. Nat Methods. 17, 155–158, https://doi.org/10.1038/s41592-019-0669-3 (2020).
Hu, J. et al. NextPolish: a fast and efficient genome polishing tool for long-read assembly. Bioinformatics. 36, 2253–2255, https://doi.org/10.1093/bioinformatics/btz891 (2020).
Zhang, X. et al. Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data. Nat Plants. 5, 833–845, https://doi.org/10.1038/s41477-019-0487-8 (2019).
Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580, https://doi.org/10.1093/nar/27.2.573 (1999).
Jurka, J. et al. Repbase update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 110, 462–467, https://doi.org/10.1159/000084979 (2005).
Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268, https://doi.org/10.1093/nar/gkm286 (2007).
Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in large genomes. Bioinformatics. 21, i351–i358, https://doi.org/10.1093/bioinformatics/bti1018 (2005).
Flynn, J. M. et al. RepeatModeler 2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci USA. 117, 9451–9457, https://doi.org/10.1073/pnas.1921046117 (2020).
Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402, https://doi.org/10.1093/nar/25.17.3389 (1997).
Birney, E., Clamp, M. & Durbin, R. GeneWise and Genomewise. Genome Res. 14, 988–995, https://doi.org/10.1101/gr.1865504 (2004).
Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439, https://doi.org/10.1093/nar/gkl200 (2006).
Blanco, E., Parra, G. & Guigo, R. Using geneid to identify genes. Curr Protoc Bioinformatics. 18, 4.3.1–4.3.28, https://doi.org/10.1002/0471250953.bi0403s18 (2007).
Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J Mol Biol. 268, 78–94, https://doi.org/10.1006/jmbi.1997.0951 (1997).
Majoros, W. H., Pertea, M. & Salzberg, S. L. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics. 20, 2878–2879, https://doi.org/10.1093/bioinformatics/bth315 (2004).
Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9, R7, https://doi.org/10.1186/gb-2008-9-1-r7 (2008).
Haas, B. J. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666, https://doi.org/10.1093/nar/gkg770 (2003).
Zheng, J. B. et al. Full-length transcriptome analysis provides new insights into the innate immune system of Marsupenaeus japonicus. Fish Shellfish Immun. 106, 283–295, https://doi.org/10.1016/j.fsi.2020.07.018 (2020).
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics. 10, 421, https://doi.org/10.1186/1471-2105-10-421 (2009).
Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics. 30, 1236–1240, https://doi.org/10.1093/bioinformatics/btu031 (2014).
Bairoch, A. & Apweiler, R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 28, 45–48, https://doi.org/10.1093/nar/28.1.45 (2000).
Buchfink, B., Reuter, K. & Drost, H. G. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat Methods. 18, 366–368, https://doi.org/10.1038/s41592-021-01101-x (2021).
Kanehisa, M. & Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30, https://doi.org/10.1093/nar/28.1.27 (2000).
Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964, https://doi.org/10.1093/nar/25.5.955 (1997).
Nawrocki, E. P., Kolbe, D. L. & Eddy, S. R. Infernal 1.0: inference of RNA alignments. Bioinformatics. 25, 1335–1337, https://doi.org/10.1093/bioinformatics/btp157 (2009).
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238, https://doi.org/10.1186/s13059-019-1832-y (2019).
Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797, https://doi.org/10.1093/nar/gkh340 (2004).
Stamatakis, A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 22, 2688–2690, https://doi.org/10.1093/bioinformatics/btl446 (2006).
NCBI Sequence Read Archive. https://identifiers.org/ncbi/insdc.sra:SRP581623 (2025).
ENA European Nucleotide Archive. https://identifiers.org/ena.embl:PRJEB102642 (2025).
Xu, H. Chromosome-level genome assembly and annotation of the kuruma shrimp (Marsupenaeus japonicus). figshare https://doi.org/10.6084/m9.figshare.28874273.v1 (2025).
Manni, M. et al. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 38, 4647–4654, https://doi.org/10.1093/molbev/msab199 (2021).
Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics. 23, 1061–1067, https://doi.org/10.1093/bioinformatics/btm071 (2007).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 25, 1754–1760, https://doi.org/10.1093/bioinformatics/btp324 (2009).
Acknowledgements
This work was supported by the National Key R&D Program of China (2024YFD2401703), Fujian Special Fund for the Development of Marine and Fishery (FJHYF-L-2025-08), China Agriculture Research System (CARS-48), Fujian Special Fund for the Development of Marine and Fishery (FJHYF-ZH-2023-04), and Fujian Provincial Science and Technology Planning Projects (2022L3001).
Author information
Contributions
Y.W., H.X., H.H., S.D. and Y.M. conceived this project; Y.W., H.X., Z.Z., P.W. and W.C. collected the samples and performed the experiments; Y.W., H.X., Z.Z., P.W., W.C., H.H., S.D. and Y.M. performed the research and analyzed the data; Y.W., H.X., S.D. and Y.M. drafted the manuscript. All authors have read and approved the final manuscript for publication.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wei, Y., Xu, H., Zhou, Z. et al. Chromosome-level genome assembly and annotation of the kuruma shrimp Marsupenaeus japonicus. Sci Data (2025). https://doi.org/10.1038/s41597-025-06317-6
