Chromosome-level genome assembly and annotation of the kuruma shrimp Marsupenaeus japonicus


Abstract

The kuruma shrimp Marsupenaeus japonicus is one of the most economically important shrimp species in the world. Here, we constructed a chromosome-level genome assembly of M. japonicus by combining PacBio long reads, Illumina short reads and Hi-C scaffolding. The assembled genome was 1.64 Gb in size with a scaffold N50 of 40.61 Mb, and 97.83% (1.60 Gb) of the assembled sequence was anchored to 43 chromosomes. The genome contained 62.79% repeat sequences and 21,172 protein-coding genes, of which 83.20% were functionally annotated. The completeness of the M. japonicus genome assembly is supported by a BUSCO score of 91.0%. Evolutionary analysis indicated that M. japonicus is closely related to Litopenaeus vannamei and Penaeus monodon, having diverged from their common ancestor an estimated 88.33 million years ago. In sum, this chromosome-level genome assembly and annotation provide a fundamental resource for genetic breeding and for studies of molecular mechanisms in M. japonicus.
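For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how such a value is conventionally computed from a list of scaffold lengths. It is not the authors' pipeline; the function names and the toy lengths are hypothetical and chosen only for illustration.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly size."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths


def anchored_fraction(anchored_bp: int, total_bp: int) -> float:
    """Fraction of assembled bases placed on chromosomes."""
    return anchored_bp / total_bp


# Hypothetical toy scaffold lengths (not data from this study).
toy_scaffolds = [50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 40: the two longest scaffolds cover 90 of 150 bp
```

Applying `anchored_fraction` to the rounded figures in the abstract (1.60 Gb of 1.64 Gb) would not reproduce 97.83% exactly, because the reported percentage was presumably computed from unrounded base counts.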

Data availability

All raw sequencing data used for genome assembly were deposited in the NCBI Sequence Read Archive (SRA) under accession number SRP581623 (ref. 51). The chromosome-level assembly of the M. japonicus genome was deposited in the European Nucleotide Archive (ENA) under accession number PRJEB102642 (ref. 52). The genome annotation files were deposited in figshare (https://doi.org/10.6084/m9.figshare.28874273.v1; ref. 53).

Code availability

All commands and pipelines used in data processing were executed according to the manuals and protocols of the corresponding bioinformatic software. No custom code was developed for this study.

References

  1. Tsoi, K. H. et al. Verification of the cryptic species Penaeus pulchricaudatus in the commercially important kuruma shrimp P. japonicus (Decapoda, Penaeidae) using molecular taxonomy. Invertebr Syst. 28, 476–490, https://doi.org/10.1071/IS14001 (2014).

  2. Tsoi, K. H., Chan, T. Y. & Chu, K. H. Molecular population structure of the kuruma shrimp Penaeus japonicus species complex in western Pacific. Mar Biol. 150, 1345–1364, https://doi.org/10.1007/s00227-006-0426-x (2007).

  3. Duan, Y. F. et al. Effect of desiccation and resubmersion on the oxidative stress response of the kuruma shrimp Marsupenaeus japonicus. Fish Shellfish Immun. 49, 91–99, https://doi.org/10.1016/j.fsi.2015.12.018 (2016).

  4. Wang, P. P. et al. Air exposure affects physiological responses, innate immunity, apoptosis and DNA methylation of kuruma shrimp, Marsupenaeus japonicus. Front Physiol. 11, 223, https://doi.org/10.3389/fphys.2020.00223 (2020).

  5. Francis, B. et al. Reproductive performance, salinity tolerance, growth and production performance of a cryptic species Penaeus (Marsupenaeus) japonicus. Aquac Res. 52, 5506–5516, https://doi.org/10.1111/are.15424 (2021).

  6. Hudinaga, M. Reproduction, development and rearing of Penaeus japonicus Bate. Jap J Zool. 10, 305–393 (1942).

  7. China Fishery Statistics Yearbook. Ministry of Agriculture and Rural Affairs of the People’s Republic of China. China Agriculture Press, Beijing (2024).

  8. Behringer, D. C. & Duermit-Moreau, E. Crustaceans, one health and the changing ocean. J Invertebr Pathol. 186, 107500, https://doi.org/10.1016/j.jip.2020.107500 (2021).

  9. Satam, H. et al. Next-generation sequencing technology: current trends and advancements. Biology. 12, 997, https://doi.org/10.3390/biology12070997 (2023).

  10. Yuan, J. B. et al. Recent advances in crustacean genomics and their potential application in aquaculture. Rev Aquac. 15, 1501–1521, https://doi.org/10.1111/raq.12791 (2023).

  11. Briones-Fourzan, P. & Hendrickx, M. E. Ecology and diversity of marine Decapod crustaceans. Diversity. 14, 614, https://doi.org/10.3390/d14080614 (2022).

  12. Zhang, X. J. et al. Penaeid shrimp genome provides insights into benthic adaptation and frequent molting. Nat Commun. 10, 356, https://doi.org/10.1038/s41467-018-08197-4 (2019).

  13. Uengwetwanit, T. et al. A chromosome-level assembly of the black tiger shrimp (Penaeus monodon) genome facilitates the identification of growth-associated genes. Mol Ecol Resour. 21, 1620–1640, https://doi.org/10.1111/1755-0998.13357 (2021).

  14. Yuan, J. B. et al. Simple sequence repeats drive genome plasticity and promote adaptive evolution in penaeid shrimp. Commun Biol. 4, 186, https://doi.org/10.1038/s42003-021-01716-y (2021).

  15. Qi, H. G. et al. Construction and analysis of the chromosome-level haplotype-resolved genomes of two Crassostrea oyster congeners: Crassostrea angulata and Crassostrea gigas. Gigascience. 12, giad077, https://doi.org/10.1093/gigascience/giad077 (2023).

  16. Kawato, S. et al. Genome and transcriptome assemblies of the kuruma shrimp, Marsupenaeus japonicus. G3. 11, jkab268, https://doi.org/10.1093/g3journal/jkab268 (2021).

  17. Ren, X. Y. et al. A chromosome-level genome of the kuruma shrimp (Marsupenaeus japonicus) provides insights into its evolution and cold-resistance mechanism. Genomics. 114, 110373, https://doi.org/10.1016/j.ygeno.2022.110373 (2022).

  18. Hayashi, K. I. & Fujiwara, Y. A new method for obtaining metaphase chromosomes from the regeneration blastema of Penaeus (Marsupenaeus) japonicus. Nippon Suisan Gakk. 54, 1563–1565, https://doi.org/10.2331/suisan.54.1563 (1988).

  19. Xiang, J. H., Liu, R. Y. & Zhou, L. H. Chromosomes of marine shrimps with special reference to different techniques. Aquaculture. 111, 321 (1993).

  20. Zhang, X. J. et al. Penaeid shrimp chromosome studies entering the post-genomic era. Genes. 14, 2050, https://doi.org/10.3390/genes14112050 (2023).

  21. Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 159, 1665–1680, https://doi.org/10.1016/j.cell.2014.11.021 (2014).

  22. Liu, B. et al. Estimation of genomic characteristics by analyzing k-mer frequency in de novo genome projects. Quantitative Biology. 35, 62–67, https://doi.org/10.48550/arXiv.1308.2012 (2013).

  23. Li, R. et al. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 20, 265–272, https://doi.org/10.1101/gr.097261.109 (2010).

  24. Ruan, J. & Li, H. Fast and accurate long-read assembly with wtdbg2. Nat Methods. 17, 155–158, https://doi.org/10.1038/s41592-019-0669-3 (2020).

  25. Hu, J. et al. NextPolish: a fast and efficient genome polishing tool for long-read assembly. Bioinformatics. 36, 2253–2255, https://doi.org/10.1093/bioinformatics/btz891 (2020).

  26. Zhang, X. et al. Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data. Nat Plants. 5, 833–845, https://doi.org/10.1038/s41477-019-0487-8 (2019).

  27. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580, https://doi.org/10.1093/nar/27.2.573 (1999).

  28. Jurka, J. et al. Repbase update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 110, 462–467, https://doi.org/10.1159/000084979 (2005).

  29. Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268, https://doi.org/10.1093/nar/gkm286 (2007).

  30. Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in large genomes. Bioinformatics. 21, i351–i358, https://doi.org/10.1093/bioinformatics/bti1018 (2005).

  31. Flynn, J. M. et al. RepeatModeler 2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci USA. 117, 9451–9457, https://doi.org/10.1073/pnas.1921046117 (2020).

  32. Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402, https://doi.org/10.1093/nar/25.17.3389 (1997).

  33. Birney, E., Clamp, M. & Durbin, R. GeneWise and Genomewise. Genome Res. 14, 988–995, https://doi.org/10.1101/gr.1865504 (2004).

  34. Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439, https://doi.org/10.1093/nar/gkl200 (2006).

  35. Blanco, E., Parra, G. & Guigo, R. Using geneid to identify genes. Curr Protoc Bioinformatics. 18, 4.3.1–4.3.28, https://doi.org/10.1002/0471250953.bi0403s18 (2007).

  36. Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J Mol Biol. 268, 78–94, https://doi.org/10.1006/jmbi.1997.0951 (1997).

  37. Majoros, W. H., Pertea, M. & Salzberg, S. L. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics. 20, 2878–2879, https://doi.org/10.1093/bioinformatics/bth315 (2004).

  38. Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9, R7, https://doi.org/10.1186/gb-2008-9-1-r7 (2008).

  39. Haas, B. J. et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666, https://doi.org/10.1093/nar/gkg770 (2003).

  40. Zheng, J. B. et al. Full-length transcriptome analysis provides new insights into the innate immune system of Marsupenaeus japonicus. Fish Shellfish Immun. 106, 283–295, https://doi.org/10.1016/j.fsi.2020.07.018 (2020).

  41. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics. 10, 421, https://doi.org/10.1186/1471-2105-10-421 (2009).

  42. Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics. 30, 1236–1240, https://doi.org/10.1093/bioinformatics/btu031 (2014).

  43. Bairoch, A. & Apweiler, R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 28, 45–48, https://doi.org/10.1093/nar/28.1.45 (2000).

  44. Buchfink, B., Reuter, K. & Drost, H. G. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat Methods. 18, 366–368, https://doi.org/10.1038/s41592-021-01101-x (2021).

  45. Kanehisa, M. & Goto, S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30, https://doi.org/10.1093/nar/28.1.27 (2000).

  46. Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964, https://doi.org/10.1093/nar/25.5.955 (1997).

  47. Nawrocki, E. P., Kolbe, D. L. & Eddy, S. R. Infernal 1.0: inference of RNA alignments. Bioinformatics. 25, 1335–1337, https://doi.org/10.1093/bioinformatics/btp157 (2009).

  48. Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238, https://doi.org/10.1186/s13059-019-1832-y (2019).

  49. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797, https://doi.org/10.1093/nar/gkh340 (2004).

  50. Stamatakis, A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 22, 2688–2690, https://doi.org/10.1093/bioinformatics/btl446 (2006).

  51. NCBI Sequence Read Archive. https://identifiers.org/ncbi/insdc.sra:SRP581623 (2025).

  52. ENA European Nucleotide Archive. https://identifiers.org/ena.embl:PRJEB102642 (2025).

  53. Xu, H. Chromosome-level genome assembly and annotation of the kuruma shrimp (Marsupenaeus japonicus). figshare https://doi.org/10.6084/m9.figshare.28874273.v1 (2025).

  54. Manni, M. et al. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 38, 4647–4654, https://doi.org/10.1093/molbev/msab199 (2021).

  55. Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics. 23, 1061–1067, https://doi.org/10.1093/bioinformatics/btm071 (2007).

  56. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 25, 1754–1760, https://doi.org/10.1093/bioinformatics/btp324 (2009).

Acknowledgements

This work was supported by the National Key R&D Program of China (2024YFD2401703), Fujian Special Fund for the Development of Marine and Fishery (FJHYF-L-2025-08), China Agriculture Research System (CARS-48), Fujian Special Fund for the Development of Marine and Fishery (FJHYF-ZH-2023-04), and Fujian Provincial Science and Technology Planning Projects (2022L3001).

Author information

Contributions

Y.W., H.X., H.H., S.D. and Y.M. conceived this project; Y.W., H.X., Z.Z., P.W. and W.C. collected the samples and performed the experiments; Y.W., H.X., Z.Z., P.W., W.C., H.H., S.D. and Y.M. performed the research and analyzed the data; Y.W., H.X., S.D. and Y.M. drafted the manuscript. All authors have read and approved the final manuscript for publication.

Corresponding authors

Correspondence to Shaoxiong Ding or Yong Mao.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Wei, Y., Xu, H., Zhou, Z. et al. Chromosome-level genome assembly and annotation of the kuruma shrimp Marsupenaeus japonicus. Sci Data (2025). https://doi.org/10.1038/s41597-025-06317-6

Source: Ecology - nature.com
