Omics exploration of deep-sea biodiversity: data from the “Pourquoi Pas les Abysses?” and eDNAbyss projects


Abstract

The deep-sea floor covers more than half of the surface of our planet, yet the extent and distribution of deep-sea biodiversity and its contribution to major biogeochemical cycles remain poorly understood. This knowledge gap stems from several factors, including sampling issues, the magnitude of the work required for morphological inventories, and the difficulty of integrating results from disparate local studies. The application of meta-omics to environmental DNA now makes it possible to assemble interoperable datasets at different spatial scales and so move towards a global assessment of deep-sea biodiversity. We present a large-scale dataset on deep-sea biodiversity, with data and metadata openly accessible at ENA and Zenodo. The resource was generated within the “Pourquoi Pas les Abysses?” and eDNAbyss projects using standardized protocols developed according to FAIR principles, covering every step from fieldwork through bioinformatic analysis. Together with information ensuring reproducibility, this dataset, which combines metagenomics, metabarcoding across the Tree of Life and capture-by-hybridization, contributes to the concerted international effort to achieve a holistic view of biodiversity in the largest biome on Earth.

Data availability

The global dataset has been deposited in the European Nucleotide Archive (ENA) as project PRJEB39225 (https://www.ebi.ac.uk/ena/browser/view/PRJEB39225), with metadata available on Zenodo (https://zenodo.org/records/6815677).
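
As an illustration of programmatic access, the run and file listing for the project can be requested from the ENA Portal API "filereport" endpoint. The following is a minimal sketch, not part of the published pipeline: the function names and the choice of fields are assumptions made for this example, and the helper that fetches data requires network access.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# ENA Portal API endpoint that returns one row per sequencing run.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession="PRJEB39225",
                   fields=("run_accession", "sample_accession", "fastq_ftp")):
    """Build the file-report URL for a project accession (illustrative helper)."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": ",".join(fields),
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_filereport(tsv_text):
    """Parse the tab-separated report into a list of dicts.

    The fastq_ftp column holds ';'-separated file locations, so it is
    split into a list (empty when no FASTQ file is listed).
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for row in reader:
        row["fastq_ftp"] = [u for u in (row.get("fastq_ftp") or "").split(";") if u]
        rows.append(row)
    return rows


def fetch_runs(accession="PRJEB39225"):
    """Download and parse the report (performs an HTTP request)."""
    with urlopen(filereport_url(accession)) as resp:
        return parse_filereport(resp.read().decode("utf-8"))

# Example (requires network access):
#   runs = fetch_runs()
#   print(len(runs), "runs listed for PRJEB39225")
```

Keeping URL construction and parsing separate from the download step allows the listing logic to be checked offline before any large transfer is started.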

Code availability

1. Real-Time Analysis (RTA) software: https://www.illumina.com/search.html?filter=support&q=RTA%20download&p=1

2. Bcl2fastq Conversion: https://support.illumina.com/downloads/bcl2fastq-conversion-software-v2-20.html

3. Cutadapt: https://github.com/marcelm/cutadapt/releases/tag/v1.18

4. Fastx_clean software: http://www.genoscope.cns.fr/fastxtend

5. FASTX-Toolkit: http://hannonlab.cshl.edu/fastx_toolkit/index.html

6. SortMeRNA v2.1: https://github.com/biocore/sortmerna

7. fastx_estimate_duplicate software: http://www.genoscope.cns.fr/fastxtend

8. fastx_mergepairs software: http://www.genoscope.cns.fr/fastxtend

9. Usearch: https://www.drive5.com/usearch/

10. Trimmomatic: https://github.com/usadellab/Trimmomatic

11. Decontam: https://github.com/benjjneb/decontam

12. Prinseq: https://github.com/uwb-linux/prinseq

13. Qiime2 feature classifier: https://github.com/qiime2/q2-feature-classifier

14. FastQC: https://github.com/s-andrews/FastQC

15. BBTools: https://github.com/kbaseapps/BBTools

16. MultiQC: https://github.com/MultiQC/MultiQC

17. MetaRib: https://github.com/yxxue/MetaRib

18. EMIRGE: https://github.com/csmiller/EMIRGE

19. VSearch: https://github.com/torognes/vsearch

20. IDBA_UD: https://github.com/1928d/idba_ud

21. CAP3: https://faculty.sites.iastate.edu/xqhuang/cap3-and-pcap-sequence-and-genome-assemblyprograms

22. eDNAbyss pipeline: https://gitlab.ifremer.fr/abyss-project/

23. MUMU algorithm: https://github.com/frederic-mahe/mumu

24. bbmap: https://sourceforge.net/projects/bbmap/

26. RiboTaxa: https://github.com/oschakoory/RiboTaxa

27. eDNAbyss pipeline(s): https://gitlab.ifremer.fr/abyss-project/

Acknowledgements

We express special gratitude to the scientific direction and scientific committee of the “Pourquoi Pas les Abysses?” project, to the board of the French Oceanographic Fleet for allowing unusual use of boat time (including transits) through the AMIGO series, and to the mission chiefs of all the crews who kindly sampled for the project. This work was supported by Ifremer during the development of prototypes and protocols in “Pourquoi Pas les Abysses?” and by Genoscope, the Commissariat à l’Energie Atomique et aux Energies Alternatives (CEA) and France Génomique (ANR-10-INBS-09) for high-throughput sequencing in eDNAbyss (AP2016–228 France Génomique). For samples gathered during the associated cruises, we also thank the HADES-ERC Advanced grant (#669947), the EU Atlas project (678760), the LIFEDEEPER project (ANR-22-POCE-0007), which benefited from State aid managed by the National Research Agency under France 2030, and the ANR Cerberus project (ANR-17-CE02-0003). We thank the MarEEE project (MUSE, Montpellier, ANR-16-IDEX-0006) for the improvement of the original bioinformatic pipeline. We warmly acknowledge all the crews, mission chiefs and colleagues who contributed to gathering this widespread sample collection: Covadonga Orejas, Martin Ludvigsen and Eva Ramirez-Llodra, Jean-Paul Justiniano, Yves Fouquet and Ewan Pelleter, Ewen Raugel, Wayne Crawford, Cécile Guieu, Sophie Bonnet, Sophie Arnaud-Haond, François Bonhomme, Pierre-Marie Sarradin, Carlos Duarte, Franck Wenzhoefer, Mathilde Cannat, Norbert Franck, Marie-Anne Cambon, Stéphane Hourdez and Didier Jollivet.
We would like to gratefully acknowledge the entire Genoscope technical team: Julie Batisse, Odette Beluche, Isabelle Bordelais, Elodie Brun, Maria Dubois, Corinne Dumont, Zineb El Hajji, Barbara Estrada, Thomas Guérin, Chadia Hamon, Sandrine Lebled, Patricia Lenoble, Marine Lepretre, Claudine Louesse, Ghislaine Magdelenat, Eric Mahieu, Claire Milani, Sophie Oztas, Emilie Payen, Emmanuelle Petit, Muriel Ronsin and Benoît Vacherie, for their invaluable work in producing the data. We thank the editorial team and the referees for the improvements suggested to previous versions of this manuscript.

Author information

Contributions

S.A.H., C.B., J.P., S.C.M. and F.P. wrote the manuscript with the help of F.V., S.H. and M.M. All coauthors reviewed the manuscript. S.A.H., F.P., J.S., C.dV., J.P. and P.W. conceptualized the project. S.A.H. and P.W. obtained funding and administered the project. B.T., C.L.H., J.A., M.I.B., M.C., S.F., V.C.G., D.J., A.S.L., F.P., J.S., P.M.S., C.S., M.C., A.T.L., S.V., F.B., D.Z., O.U. and J.P., as well as all mission chiefs, contributed to the collection of the environmental samples. B.T., C.L.H., K.A., J.A., M.I.B., F.C., V.C.G., B.G., C.F., S.F., F.L., E.O., G.T.T. and S.A.H. performed the DNA extractions. J.P., C.B., M.I.B. and C.L.H. developed the amplicon sequencing protocol, and S.C.M. and P.P. developed the capture-by-hybridization (CBH) protocol. J.P., K.L., F.G., P.H.O. and all the Genoscope technical teams were involved in library preparation, sequencing and data curation for metagenomics and metabarcoding; S.C.M. and P.P. prepared and sequenced the libraries for CBH. C.B. and J.M.A. developed the data validation and visualization software. S.A.H., B.T., M.V., J.M.A., J.P. and C.B. contributed to validation. M.I.B., A.C.J., B.G., B.T., L.M., P.D., S.A.H., N.H., K.A., F.V., S.C.M. and P.P. developed the bioinformatics pipelines and/or performed the data analysis. S.P., C.B., S.A.H. and S.V. provided the metadata and data. C.B.H., S.G., J.G., G.S., E.K.J., S.P., S.C.M. and P.D. managed the data to be transmitted to a public repository.

Corresponding authors

Correspondence to Sophie Arnaud-Haond or Julie Poulain.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Arnaud-Haond, S., Trouche, B., Liautard-Haag, C. et al. Omics exploration of deep-sea biodiversity: data from the “Pourquoi Pas les Abysses?” and eDNAbyss projects. Sci Data (2025). https://doi.org/10.1038/s41597-025-06009-1

  • DOI: https://doi.org/10.1038/s41597-025-06009-1


Source: Ecology - nature.com
