Hug, L. A. et al. A new view of the tree of life. Nat. Microbiol. 1, 16048 (2016).
Spang, A. et al. Complex archaea that bridge the gap between prokaryotes and eukaryotes. Nature 521, 173–179 (2015).
Tyson, G. W. et al. Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature 428, 37–43 (2004).
Anantharaman, K. et al. Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system. Nat. Commun. 7, 13219 (2016).
Parks, D. H. et al. Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life. Nat. Microbiol. 2, 1533–1542 (2017).
Tully, B. J., Graham, E. D. & Heidelberg, J. F. The reconstruction of 2,631 draft metagenome-assembled genomes from the global oceans. Sci. Data 5, 170203 (2018).
Stewart, R. D. et al. Assembly of 913 microbial genomes from metagenomic sequencing of the cow rumen. Nat. Commun. 9, 870 (2018).
Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography and lifestyle. Cell 176, 649–662 (2019).
Nayfach, S. et al. A genomic catalog of Earth’s microbiomes. Nat. Biotechnol. 39, 499–509 (2021). https://doi.org/10.1038/s41587-020-0718-6
Gilbert, J. A., Jansson, J. K. & Knight, R. The Earth Microbiome project: successes and aspirations. BMC Biol 12, 69 (2014).
Saheb Kashaf, S., Almeida, A., Segre, J. A. & Finn, R. D. Recovering prokaryotic genomes from host-associated, short-read shotgun metagenomic sequencing data. Nat. Protoc. 16, 2520–2541 (2021).
Chong, J., Liu, P., Zhou, G. & Xia, J. Using MicrobiomeAnalyst for comprehensive statistical, functional, and meta-analysis of microbiome data. Nat. Protoc. 15, 799–821 (2020).
Arkin, A. P. et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nat. Biotechnol. 36, 566–569 (2018).
Sayers, E. W. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 49, D10–D17 (2021).
Kluyver, T. et al. Jupyter Notebooks – a publishing format for reproducible computational workflows. In: Loizides F, Schmidt B, editors. Positioning and Power in Academic Publishing: Players, Agents and Agendas. p. 87–90 (2016).
Banfield, J. Development of a Knowledgebase to Integrate, Analyze, Distribute, and Visualize Microbial Community Systems Biology Data. (2015). Report number: DOE-UCB-4918, OSTI ID: 1167269.
Chen, I.-M. A. et al. IMG/M v.5.0: an integrated data management and comparative analysis system for microbial genomes and microbiomes. Nucleic Acids Res 47, D666–D677 (2019).
Afgan, E. et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update. Nucleic Acids Res 44, W3–W10 (2016).
Devisetty, U. K., Kennedy, K., Sarando, P., Merchant, N. & Lyons, E. Bringing your tools to CyVerse discovery environment using Docker. F1000Res. 5, 1442 (2016).
Wang, L., Lu, Z., Van Buren, P. & Ware, D. SciApps: a bioinformatics workflow platform powered by XSEDE and CyVerse. In Proceedings of the Practice and Experience on Advanced Research Computing 1–5 (Association for Computing Machinery, 2018).
Eren, A. M. et al. Community-led, integrated, reproducible multi-omics with anvi’o. Nat. Microbiol. 6, 3–6 (2021).
Wattam, A. R. et al. Improvements to PATRIC, the all-bacterial bioinformatics database and analysis resource center. Nucleic Acids Res 45, D535–D542 (2017).
Mitchell, A. L. et al. MGnify: the microbiome analysis resource in 2020. Nucleic Acids Res. 48, D570–D578 (2020).
Wu, Y.-W. et al. Ionic liquids impact the bioenergy feedstock-degrading microbiome and transcription of enzymes relevant to polysaccharide hydrolysis. mSystems 1, e00120–16 (2016).
Rajeev, L. et al. Dynamic cyanobacterial response to hydration and dehydration in a desert biological soil crust. ISME J 7, 2178–2191 (2013).
Foster, I. Globus Online: accelerating and democratizing science through cloud-based services. IEEE Internet Comput 15, 70–73 (2011).
Nurk, S., Meleshko, D., Korobeynikov, A. & Pevzner, P. A. metaSPAdes: a new versatile metagenomic assembler. Genome Res 27, 824–834 (2017).
Zhang, H. et al. dbCAN2: a meta server for automated carbohydrate-active enzyme annotation. Nucleic Acids Res 46, W95–W101 (2018).
Chaumeil, P.-A., Mussig, A. J., Hugenholtz, P. & Parks, D. H. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics 36, 1925–1927 (2019).
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinform 10, 421 (2009).
Nordberg, H. et al. The genome portal of the Department of Energy Joint Genome Institute: 2014 updates. Nucleic Acids Res 42, D26–D31 (2014).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet. J. 17, 10–12 (2011).
Menzel, P., Ng, K. L. & Krogh, A. Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nat. Commun. 7, 11257 (2016).
Freitas, T. A. K., Li, P.-E., Scholz, M. B. & Chain, P. S. G. Accurate read-based metagenome characterization using a hierarchical suite of unique signatures. Nucleic Acids Res 43, e69 (2015).
Wood, D. E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol 20, 257 (2019).
Truong, D. T. et al. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nat. Methods 12, 902–903 (2015).
Milanese, A. et al. Microbial abundance, activity and population genomic profiling with mOTUs2. Nat. Commun. 10, 2014 (2019).
Youngblut, N. D. & Ley, R. E. Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets. PeerJ 9, e12198 (2021).
Ondov, B. D., Bergman, N. H. & Phillippy, A. M. Interactive metagenomic visualization in a Web browser. BMC Bioinform 12, 385 (2011).
Li, D., Liu, C.-M., Luo, R., Sadakane, K. & Lam, T.-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674–1676 (2015).
Peng, Y., Leung, H. C. M., Yiu, S. M. & Chin, F. Y. L. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 28, 1420–1428 (2012).
Orakov, A. et al. GUNC: detection of chimerism and contamination in prokaryotic genomes. Genome Biol 22, 178 (2021).
Gurevich, A., Saveliev, V., Vyahhi, N. & Tesler, G. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072–1075 (2013).
Wu, Y.-W., Simmons, B. A. & Singer, S. W. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics 32, 605–607 (2016).
Kang, D. D. et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 7, e7359 (2019).
Alneberg, J. et al. Binning metagenomic contigs by coverage and composition. Nat. Methods 11, 1144–1146 (2014).
Sieber, C. M. K. et al. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nat. Microbiol. 3, 836–843 (2018).
Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 25, 1043–1055 (2015).
Delcher, A. L., Salzberg, S. L. & Phillippy, A. M. Using MUMmer to identify similar regions in large sequence sets. Curr. Protoc. Bioinform. Chapter 10, Unit 10.3 (2003).
Darling, A. C. E., Mau, B., Blattner, F. R. & Perna, N. T. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Res 14, 1394–1403 (2004).
Parks, D. H. et al. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res 50, D785–D794 (2022).
Bowers, R. M. et al. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea. Nat. Biotechnol. 35, 725–731 (2017).
Brettin, T. et al. RASTtk: a modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci. Rep. 5, 8365 (2015).
Overbeek, R. et al. The SEED and the rapid annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res 42, D206–D214 (2014).
Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 (2014).
Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinform 11, 119 (2010).
Parks, D. H. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat. Biotechnol. 36, 996–1004 (2018).
Rinke, C. et al. A standardized archaeal taxonomy for the Genome Taxonomy Database. Nat. Microbiol. 6, 946–959 (2021).
Haft, D. H. et al. RefSeq: an update on prokaryotic genome annotation and curation. Nucleic Acids Res 46, D851–D860 (2018).
Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).
Shaffer, M. et al. DRAM for distilling microbial metabolism to automate the curation of microbiome function. Nucleic Acids Res 48, 8883–8900 (2020).
Galperin, M. Y., Makarova, K. S., Wolf, Y. I. & Koonin, E. V. Expanded microbial genome coverage and improved protein family annotation in the COG database. Nucleic Acids Res 43, D261–D269 (2015).
El-Gebali, S. et al. The Pfam protein families database in 2019. Nucleic Acids Res 47, D427–D432 (2019).
Haft, D. H. et al. TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res 41, D387–D395 (2013).
Eddy, S. R. Accelerated Profile HMM Searches. PLoS Comput. Biol. 7, e1002195 (2011).
Lombard, V., Golaconda Ramulu, H., Drula, E., Coutinho, P. M. & Henrissat, B. The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res 42, D490–D495 (2014).
Chivian, D., Dehal, P. S., Keller, K. & Arkin, A. P. MetaMicrobesOnline: phylogenomic analysis of microbial communities. Nucleic Acids Res 41, D648–D654 (2013).
Karaoz, U. & Brodie, E. L. microTrait: a toolset for a trait-based representation of microbial genomes. Front. Bioinform. https://doi.org/10.3389/fbinf.2022.918853 (2022).
Wood-Charlson, E. M. et al. The National Microbiome Data Collaborative: enabling microbiome science. Nat. Rev. Microbiol. 18, 313–314 (2020).
Hofmeyr, S. et al. Terabase-scale metagenome coassembly with MetaHipMer. Sci. Rep. 10, 10689 (2020).
Kolmogorov, M. et al. metaFlye: scalable long-read metagenome assembly using repeat graphs. Nat. Methods 17, 1103–1110 (2020).
Koren, S. et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 27, 722–736 (2017).
Bertrand, D. et al. Hybrid metagenomic assembly enables high-resolution analysis of resistance determinants and mobile elements in human microbiomes. Nat. Biotechnol. 37, 937–944 (2019).
Chen, L.-X. et al. Accurate and complete genomes from metagenomes. Genome Res 30, 315–333 (2020).
Lui, L. M., Nielsen, T. N. & Arkin, A. P. A method for achieving complete microbial genomes and improving bins from metagenomics data. PLoS Comput Biol 17, e1008972 (2021).
Miller, C. S., Baker, B. J., Thomas, B. C., Singer, S. W. & Banfield, J. F. EMIRGE: reconstruction of full-length ribosomal genes from microbial community short read sequencing data. Genome Biol 12, R44 (2011).
Chivian, D. et al. Genome extraction from shotgun metagenome sequence data. KBase n/33233/628 https://doi.org/10.25982/33233.606/1831502 (2022).
Chivian, D. et al. Moab desert crust – sample 4E. KBase n/62384/334 https://doi.org/10.25982/62384.253/1831503 (2022).
Jain, C., Rodriguez-R, L. M., Phillippy, A. M., Konstantinidis, K. T. & Aluru, S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat. Commun. 9, 5114 (2018).
Matsen, F. A., Kodner, R. B. & Armbrust, E. V. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinform 11, 538 (2010).
Benson, D. A. et al. GenBank. Nucleic Acids Res 46, D41–D47 (2018).
Ewing, B. & Green, P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8, 186–194 (1998).
Teiling, C. BaseSpace: Simplifying metagenomic analysis. 26th European Congress of Clinical Microbiology and Infectious Diseases https://doi.org/10.26226/morressier.56d5ba2ed462b80296c9509d (2016).
Reich, M. et al. The GenePattern notebook environment. Cell Syst 5, 149–151.e1 (2017).
Uritskiy, G. V., DiRuggiero, J. & Taylor, J. MetaWRAP-a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome 6, 158 (2018).
Karp, P. D. et al. A comparison of microbial genome web portals. Front. Microbiol. 10, 208 (2019).
Yue, Y. et al. Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets. BMC Bioinform 21, 334 (2020).
Nelson, W. C., Tully, B. J. & Mobberley, J. M. Biases in genome reconstruction from metagenomic data. PeerJ 8, e10119 (2020).
Olm, M. R., Brown, C. T., Brooks, B. & Banfield, J. F. dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. ISME J 11, 2864–2868 (2017).
Li, L., Stoeckert, C. J. Jr & Roos, D. S. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res 13, 2178–2189 (2003).
Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32, 1792–1797 (2004).
Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).
Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550 (2014).
Kumari, S. et al. A KBase case study on genome-wide transcriptomics and plant primary metabolism in response to drought stress in sorghum. Curr. Plant Biol. 28, 100229 (2021).
Seaver, S. M. D. et al. The ModelSEED biochemistry database for the integration of metabolic annotations and the reconstruction, comparison and analysis of metabolic models for plants, fungi and microbes. Nucleic Acids Res 49, D575–D588 (2021).
Schloss, P. D. et al. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl. Environ. Microbiol. 75, 7537–7541 (2009).
Caporaso, J. G. et al. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335–336 (2010).