More stories

  • Impacts of the US southeast wood pellet industry on local forest carbon stocks


  • Metagenome-assembled genome extraction and analysis from microbiomes using KBase

    Seaver, S. M. D. et al. The ModelSEED biochemistry database for the integration of metabolic annotations and the reconstruction, comparison and analysis of metabolic models for plants, fungi and microbes. Nucleic Acids Res 49, D575–D588 (2021).Article 
    PubMed 
    CAS 

    Google Scholar 
    Schloss, P. D. et al. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl. Environ. Microbiol. 75, 7537–7541 (2009).Article 
    PubMed 
    PubMed Central 
    CAS 

    Google Scholar 
    Caporaso, J. G. et al. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335–336 (2010).Article 
    PubMed 
    PubMed Central 
    CAS 

    Google Scholar  More

  • in

    Scale matters in service supply


  • in

    Switch to perennial rice promotes sustainable farming

    This is a summary of: Zhang, S. et al. Sustained productivity and agronomic potential of perennial rice. Nat. Sustain. https://doi.org/10.1038/s41893-022-00997-3 (2022).

  • in

    Global and regional ecological boundaries explain abrupt spatial discontinuities in avian frugivory interactions

    Dataset acquisition

    Plant-frugivore network data were obtained through different online sources and publications (Supplementary Table 1). Only networks that met the following criteria were retrieved: (i) the network contains quantitative data (a measure of interaction frequency) from a location, pooling through time if necessary; (ii) the network includes avian frugivores. Importantly, we removed non-avian frugivores from our analyses because only 28 out of 196 raw networks (before data cleaning) sampled non-avian frugivores, and not removing non-avian frugivores would generate spurious apparent turnover between networks that did vs. did not sample those taxa. In addition, the removal of non-avian frugivores did not strongly decrease the number of frugivores in our dataset (Supplementary Fig. 20a) or the total number of links in the global network of frugivory (Supplementary Fig. 20b). Furthermore, non-avian frugivores, as well as their interactions, were not shared across ecoregions and biomes (Supplementary Fig. 21), so their inclusion would only strengthen the results we found (though as noted above, we believe that this would be spurious because they are not as well sampled); (iii) the network (after removal of non-avian frugivores) contains greater than two species in each trophic level. Because this size threshold was somewhat arbitrary, we used a sensitivity analysis to assess the effect of our network size threshold on the reported patterns (see Sensitivity analysis section in the Supplementary Methods and Supplementary Figs. 22–24); and (iv) network sampling was not taxonomically restricted, that is, sampling was not focused on a specific taxonomic group, such as a given plant or bird family.
    Note, however, that authors often select focal plants or frugivorous birds to be sampled, but this was not considered as a taxonomic restriction if plants and birds were not selected based on their taxonomy (e.g., focal plants were selected based on the availability of fruits at the time of sampling, or focal birds were selected based on previous studies of bird diet in the study site). The first source for network data was the Web of Life database (ref. 42), which contains 33 georeferenced plant-frugivore networks from 28 published studies, of which 12 networks met our criteria.

    We also accessed the Scopus database on 04 May 2020 using the following keyword combination: (“plant-frugivore*” OR “plant-bird*” OR “frugivorous bird*” OR “avian frugivore*” OR “seed dispers*”) AND (“network*” OR “web*”) to search for papers that include data on avian frugivory networks. The search returned a total of 532 studies, from which 62 networks that met the above criteria were retrieved. We also contacted authors to obtain plant-frugivore networks that were not publicly available, which provided us a further 110 networks. The remaining networks (N = 12) were obtained by checking the database from a recently published study (ref. 12). In total, 196 quantitative avian frugivory networks were used in our analyses.

    Generating the distance matrices to serve as predictor and response variables

    Ecoregion and biome distances

    We used the most up-to-date (2017) map of ecoregions and biomes (ref. 3), which divides the globe into 846 terrestrial ecoregions nested within 14 biomes, to generate our ecoregion and biome distance matrices. Of these, 67 ecoregions and 11 biomes are represented in our dataset (Supplementary Figs. 1 and 2). We constructed two alternative versions of both the ecoregion and biome distance matrices.
    In the first, binary version, if two ecological networks were from localities within the same ecoregion/biome, a dissimilarity of zero was given to this pair of networks, whereas a dissimilarity of one was given to a pair of networks from distinct ecoregions/biomes (this is the same as calculating the Euclidean distance on a presence–absence matrix with networks in rows and ecoregion/biomes in columns).

    In the second, quantitative version, we estimated the pairwise environmental dissimilarity between our ecoregions and biomes using six environmental variables recently demonstrated to be relevant in predicting ecoregion distinctness, namely mean annual temperature, temperature seasonality, mean annual rainfall, rainfall seasonality, slope and human footprint (ref. 38). We obtained climatic and elevation data from WorldClim 2.1 (ref. 43) at a spatial resolution of 1 km². We transformed the elevation raster into a slope raster using the terrain function from the raster package (ref. 44) in R (ref. 45). As a measure of human disturbance, we used human footprint, a metric that combines eight variables associated with human disturbances of the environment: the extent of built environments, crop land, pasture land, human population density, night-time lights, railways, roads and navigable waterways (ref. 26). The human footprint raster was downloaded at a 1-km² resolution (ref. 26). Because human footprint data were not available for one of our ecoregions (Galápagos Islands xeric scrub), we estimated human footprint for this ecoregion by converting visually interpreted scores into the human footprint index. We did this by analyzing satellite images of the region and following a visual score criterion (ref. 26).
    Given the previously demonstrated strong agreement between visual score and human footprint values (ref. 26), we fitted a linear model using the visual score and human footprint data from 676 validation plots located within the Deserts and xeric shrublands biome (the biome in which the Galápagos Islands xeric scrub ecoregion is located) and estimated the human footprint values for our own visual scores using the predict function in R (ref. 45).

    We used 1-km² resolution rasters and the extract function from the raster package (ref. 44) to calculate the mean value of each of our six environmental variables for each ecoregion in our dataset. Because biomes are considerably larger than ecoregions (which makes obtaining environmental data for biomes more computationally expensive), we used a coarser spatial resolution of 5 km² for calculating the mean values of environmental variables for each biome. Since a 5-km² resolution raster was not available for human footprint, we transformed the 1-km² resolution raster into a 5-km² raster using the resample function from the same package.

    To combine these six environmental variables into quantitative matrices of ecoregion and biome environmental dissimilarity, we ran a Principal Component Analysis (PCA) on our scaled multivariate data matrix (where rows are ecoregions or biomes and columns are environmental variables). From this PCA, we selected the scores of the first four principal components for ecoregions and the first three for biomes, which represented 89.6% and 88.7% of the variance, respectively, and converted them into a distance matrix by calculating the Euclidean distance between pairs of ecoregions/biomes using the vegdist function from the vegan package (ref. 46). Finally, we transformed the ecoregion or biome distance matrix into an N × N matrix, where N is the number of local networks. In this matrix, cell values represent the pairwise environmental dissimilarity between the ecoregions/biomes where the networks are located.
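    The expansion from a region-level distance matrix to a network-level N × N matrix can be illustrated with a minimal Python sketch. It is not the authors' R code: the function name, the region labels and the principal component scores below are hypothetical, and the sketch only assumes that each network is assigned to exactly one ecoregion (or biome) and that each region has a vector of PCA scores.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length score vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def network_distance_matrices(network_regions, region_scores):
    """Build binary and quantitative N x N matrices for N networks.

    network_regions: region label of each network (hypothetical input).
    region_scores:   region label -> tuple of PCA scores (hypothetical input).
    """
    n = len(network_regions)
    binary = [[0.0] * n for _ in range(n)]
    quantitative = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            ri, rj = network_regions[i], network_regions[j]
            # Binary version: 0 for the same region, 1 for distinct regions.
            binary[i][j] = 0.0 if ri == rj else 1.0
            # Quantitative version: environmental distance between regions.
            quantitative[i][j] = euclidean(region_scores[ri], region_scores[rj])
    return binary, quantitative


# Hypothetical example: three networks, two of them in the same region.
network_regions = ["A", "A", "B"]
region_scores = {"A": (0.0, 0.0), "B": (3.0, 4.0)}
binary, quantitative = network_distance_matrices(network_regions, region_scores)
```

    In this toy example, the two networks in region "A" receive a dissimilarity of zero in both matrices, while each of them is at distance one (binary) and five (quantitative) from the network in region "B".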
    The main advantage of using this quantitative approach is that, instead of simply evaluating whether avian frugivory networks located in distinct ecoregions or biomes are different from each other in terms of network composition and structure (as in our binary approach), we were also able to determine whether the extent of network dissimilarity depended on how environmentally different the ecoregions or biomes are from one another.

    Local-scale human disturbance distance

    To generate our local human disturbance distance matrix, we extracted human footprint data at a 1-km² spatial resolution (ref. 26) and calculated the mean human footprint values within a 5-km buffer zone around each network site. For the networks located within the Galápagos Islands xeric scrub ecoregion (N = 4), we estimated the human footprint index using the same method described in the previous section for ecoregion- or biome-scale human footprint. We then calculated the pairwise Euclidean distance between human footprint values from our network sites. Thus, low cell values in the local human disturbance distance matrix indicate pairs of network sites with a similar level of human disturbance, while high values represent pairs of network sites with very different levels of human disturbance.

    Spatial distance

    The spatial distance matrix was generated using the Haversine (i.e., great circle) distance between all pairwise combinations of network coordinates. In this matrix, cell values represent the geographical distance between network sites.

    Elevational difference

    We calculated the Euclidean distance between pairwise elevation values (estimated as meters above sea level) of network sites to generate our elevational difference matrix. Elevation values were obtained from the original sources when available or using Google Earth (ref. 47).
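    As an illustration of the spatial and elevational matrices described above, the following Python sketch computes pairwise Haversine distances and pairwise elevation differences. It is a sketch under stated assumptions, not the authors' implementation: the Earth radius constant, the site coordinates and the elevations are hypothetical. For a single variable, the Euclidean distance reduces to the absolute difference.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius; not stated in the source


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def pairwise(values, dist):
    """Full symmetric matrix of dist(values[i], values[j])."""
    n = len(values)
    return [[dist(values[i], values[j]) for j in range(n)] for i in range(n)]


# Hypothetical network sites as (latitude, longitude) and elevations in meters.
sites = [(0.0, 0.0), (0.0, 90.0), (-3.07, 37.35)]
elevations_m = [100.0, 850.0, 2400.0]

spatial = pairwise(sites, lambda p, q: haversine_km(p[0], p[1], q[0], q[1]))
# Euclidean distance on one variable is the absolute difference.
elevational = pairwise(elevations_m, lambda x, y: abs(x - y))
```

    The first two hypothetical sites lie a quarter of the equator apart, which gives a quick sanity check on the distance function.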
    In the elevational difference matrix, low cell values represent pairs of network sites within similar elevations, whereas high values represent pairs of network sites within very different elevations.

    Network sampling dissimilarity

    We used the metadata retrieved from each of our 196 local networks to generate our network sampling dissimilarity matrices, which aim to control statistically for differences in network sampling. There are many ways in which sampling effort could be quantified, so we began by calculating a variety of metrics, then narrowed our options by assessing which of these was most related to network metrics. We divided the sampling metrics into two categories: time span-related metrics (i.e., sampling hours and months) and empirical metrics of sampling completeness (i.e., sampling completeness and sampling intensity), which aim to account for how complete network sampling was in terms of species interactions (Supplementary Table 2).

    We selected the quantitative sampling metrics to be included in our models based on (i) the fit of generalized linear models evaluating the relationship between number of sampling hours and sampling months of the study and network-level metrics (i.e., bird richness, plant richness and number of links), and (ii) how well time span-related metrics, sampling completeness and sampling intensity predicted the proportion of known interactions that were sampled in each local network (hereafter, ratio of interactions) for a subset of the data. This latter metric, defined as the ratio between the number of interactions in the local network and the number of known possible interactions in the region involving the species in the local network, captures raw sampling completeness. Therefore, ratio of interactions estimates, for a given set of species, the proportion of all their interactions known for a region that are found to occur among those same species in the local network.
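    The ratio of interactions defined above can be sketched in a few lines of Python. The function name and the species labels are hypothetical. As an explicit assumption, any local interaction missing from the regional compilation is added to the set of known possible interactions, so the ratio cannot exceed one; the source does not say how such cases were handled.

```python
def ratio_of_interactions(local_links, regional_links):
    """Proportion of regionally known interactions, among the species present
    in the local network, that were recorded in that local network.

    Both arguments are iterables of (plant, bird) pairs (hypothetical format).
    """
    local_links = set(local_links)
    if not local_links:
        raise ValueError("the local network has no interactions")
    plants = {plant for plant, _ in local_links}
    birds = {bird for _, bird in local_links}
    # Known interactions in the region that involve only local species.
    possible = {
        (plant, bird)
        for plant, bird in regional_links
        if plant in plants and bird in birds
    }
    # Assumption: local observations also count as known interactions.
    possible |= local_links
    return len(local_links) / len(possible)


# Hypothetical example: two local links out of four known among these species.
local = {("P1", "B1"), ("P2", "B2")}
regional = {("P1", "B1"), ("P2", "B1"), ("P1", "B2"), ("P2", "B2"), ("P3", "B1")}
ratio = ratio_of_interactions(local, regional)
```

    Here the regional link involving plant "P3" is ignored because that species is absent from the local network, so the denominator contains four interactions.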
    To calculate this metric, we needed high-resolution information on the possible interactions, so we used a subset of 14 networks sampled in Aotearoa New Zealand, since there is an extensive compilation of frugivory events recorded for this country (ref. 48). After this process, we selected number of sampling hours, number of sampling months and sampling intensity for inclusion in our statistical models (Supplementary Figs. 7 and 8; Supplementary Table 2). We generated the corresponding distance matrices by calculating the Euclidean distance between metric values. Similarly, we generated a Euclidean distance matrix for differences in sampling year between pairs of networks, which aims to account for long-term changes in the environment, species composition and network sampling methods. We obtained the sampling year of our local networks from the original sources and calculated the mean sampling year value for those networks sampled across multiple years.

    Because sampling methods, such as sampling design, focus (i.e., focal taxa, which determines whether a zoocentric or phytocentric method was used), interaction frequency type (i.e., how interaction frequency was measured) and coverage (total or partial) might also affect the observed plant-frugivore interactions (ref. 49), we combined these variables into a single distance matrix to estimate the overall differences in sampling methods between networks. Because most of these variables were categorical with multiple levels (Supplementary Table 3), we generated our sampling-methods dissimilarity matrix by using a generalization of Gower’s distance method (ref. 50), which allows the treatment of different types of variables when calculating distances. For this, we used the dist.ktab function from the ade4 package (ref. 51).
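    To give a sense of how a Gower-type distance handles mixed variable types, the sketch below implements the basic Gower dissimilarity in Python. It is deliberately simpler than the generalization used through dist.ktab in ade4 and will not reproduce its values: categorical variables contribute 0 or 1 (simple matching), numeric variables contribute a range-scaled absolute difference, and all variables are weighted equally. The variable names and levels are hypothetical.

```python
def gower_dissimilarity(a, b, numeric_ranges=None):
    """Basic Gower dissimilarity between two records (dicts with the same keys).

    numeric_ranges maps each numeric variable to its observed range; every
    other variable is treated as categorical. Equal weights are an assumption.
    """
    numeric_ranges = numeric_ranges or {}
    total = 0.0
    for key in a:
        if key in numeric_ranges:
            value_range = numeric_ranges[key]
            # Range-scaled absolute difference for numeric variables.
            total += abs(a[key] - b[key]) / value_range if value_range else 0.0
        else:
            # Simple matching for categorical variables.
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)


# Hypothetical sampling-method records for two networks.
net1 = {"design": "transect", "focus": "phytocentric",
        "frequency": "visits", "coverage": "total"}
net2 = {"design": "transect", "focus": "zoocentric",
        "frequency": "visits", "coverage": "partial"}
d12 = gower_dissimilarity(net1, net2)
```

    The two hypothetical networks differ in two of four categorical variables, so their dissimilarity is 0.5.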
    We ran a Principal Coordinates Analysis (PCoA) on this distance matrix, selected the first four axes, which explained 81.2% of the variation in sampling-methods dissimilarity, and calculated the Euclidean distance between pairs of networks using the vegdist function from the vegan package (ref. 46) in R (ref. 45).

    Network dissimilarity

    We generated three network dissimilarity matrices to be our response variables in the statistical models. In the first, cell values represent the pairwise dissimilarity in species composition between networks (beta diversity of species; βS) (ref. 27). Second, we measured interaction dissimilarity (beta diversity of interactions; βWN), which represents the pairwise dissimilarity in the identity of interactions between networks (ref. 27). Importantly, we did not include interaction rewiring (βOS) in our main analysis because this metric can only be calculated for networks that share interaction partners (i.e., it estimates whether shared species interact differently) (ref. 27), which limited the number and the spatial distribution of networks available for analysis (but see the Rewiring analysis section for an analysis on the subset of our dataset for which this was possible). Metrics were calculated using the network_betadiversity function from the betalink package (ref. 52) in R (ref. 45).

    Finally, we calculated a third dissimilarity matrix to capture overall differences in network structure. We recognize that there are many potential metrics of network structure, and that many of these are strongly correlated with one another (refs. 53–56). We therefore chose a range of metrics that captured the number of links, their relative weightings (including across trophic levels), and their arrangement among species, then combined these into a single distance matrix.
    Specifically, we quantified network structural dissimilarity using the following metrics: weighted connectance, weighted nestedness, interaction evenness, PDI and modularity.

    Weighted connectance represents the number of links relative to the number of possible links, weighted by the frequency of each interaction (ref. 55), and is therefore a measure of network-level specialization (higher values of weighted connectance indicate lower specialization). Importantly, it has been suggested that connectance affects persistence in mutualistic systems (ref. 54). We measured nestedness (i.e., the pattern in which specialist species interact with proper subsets of the species that generalist species interact with) using the weighted version of nestedness based on overlap and decreasing fill (wNODF) (ref. 57). Notably, nested structures have been commonly reported in plant-frugivore networks (ref. 33). Interaction evenness is Shannon’s evenness index applied to species interactions and represents how evenly distributed the interactions are in the network (refs. 21, 58). This metric has been previously demonstrated to decline with habitat modification as a consequence of some interactions being favored over others in high-disturbance environments (ref. 21). PDI (Paired Difference Index) is a measure of species-level specialization on resources and a reliable indicator not only of specialization, but also of absolute generalism (ref. 59). Thus, this metric contributes to understanding of the ecological processes that drive the prevalence of specialists or generalists in ecological networks (ref. 59). In order to obtain a network-level PDI, we calculated the weighted mean PDI for each local network. Finally, we calculated modularity (i.e., the level of compartmentalization within networks) using the DIRTLPAwb+ algorithm (ref. 60).
    Modularity estimates the extent to which species within modules interact more with each other than with species from other modules (ref. 61), and it has been demonstrated to affect the persistence and resilience of mutualistic networks (ref. 54). All the selected network metrics are based on weighted (quantitative) interaction data, as these have been suggested to be less biased by sampling incompleteness (ref. 62) and to better reflect environmental changes (ref. 21). All network metrics were calculated using the bipartite package (ref. 63) in R (ref. 45).

    We ran a Principal Component Analysis (PCA) on our scaled multivariate data matrix (N × M, where N is the number of local networks in our dataset and M is the number of network metrics), selected the scores of the first three principal components, which represented 89.9% of the variance in network metrics, and converted them into a network structural dissimilarity matrix by calculating the Euclidean distance between networks. In this distance matrix, cell values represent differences in the overall architecture of networks (over all the network metrics calculated), and therefore provide a complementary approach for evaluating how species interaction patterns vary across large-scale environmental gradients.

    Statistical analysis

    We employed a two-tailed statistical test that combines Generalized Additive Models (GAM) (ref. 29) and Multiple Regression on distance Matrices (MRM) (ref. 30) to evaluate the effect of each of our predictor distance matrices on our response matrix. With this approach, we were able to fit GAMs where the predictor and response variables are distance matrices, while accounting for the non-independence of distances from each local network by permuting the response matrix (ref. 30). The main advantage of using GAMs is their flexibility in modeling non-linear relationships through smooth functions, which are represented by a sum of simpler, fixed basis functions that determine their complexity (ref. 29).
    Using GAM-based MRM models allowed us to obtain F values for each of the smooth terms (i.e., smooth functions of the predictor variables in our model), and test statistical significance at the level of individual variables. The binary versions of ecoregion and biome distance matrices (with two levels, “same” or “distinct”) were treated as categorical variables in the models, and t values were used for determining statistical significance. We fitted GAMs with thin plate regression splines (ref. 64) using the gam function from the mgcv package (ref. 29) in R (ref. 45). Smoothing parameters were estimated using restricted maximum likelihood (REML) (ref. 29). Our GAM-based MRM models were calculated using a modified version of the MRM function from the ecodist package (ref. 65), which allowed us to combine GAMs with the permutation approach from the original MRM function (see Code availability). All models were run with 1,000 permutations (i.e., shuffles) of the response matrix.

    We explored the unique and shared contributions of our predictor variables to network dissimilarity using deviance partitioning analyses. These were performed by fitting reduced models (i.e., GAMs where one or more predictor variables of interest were removed) using the same smoothing parameters as in the full model and comparing the explained deviance. We fixed smoothing parameters for comparisons in this way because these parameters tend to vary substantially (to compensate) if one of two correlated predictors is dropped from a GAM.

    Assessing the influence of individual studies on the reported patterns

    Because our dataset comprises 196 local frugivory networks obtained from 93 different studies, and some of these studies contained multiple networks, we needed to evaluate whether our results were strongly biased by individual studies.
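    The permutation step used in the models above (shuffling the response matrix) can be illustrated with a minimal Python sketch. It is not the authors' modified MRM function: to stay self-contained, the GAM F and t statistics are replaced by a simple Pearson correlation between the lower triangles of two distance matrices, and the function names, the toy data and the p-value formula are assumptions. The key point is that rows and columns of the response matrix are permuted together, so each network keeps its full set of distances.

```python
import random
from math import sqrt


def lower_triangle(matrix):
    """Flatten the strictly lower triangle of a square matrix."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permute_matrix(matrix, order):
    """Reorder rows and columns with the same permutation of objects."""
    return [[matrix[i][j] for j in order] for i in order]


def permutation_p_value(response, predictor, n_perm=1000, seed=0):
    """Two-tailed permutation p-value for a stand-in statistic (Pearson r)."""
    rng = random.Random(seed)
    x = lower_triangle(predictor)
    observed = pearson(lower_triangle(response), x)
    order = list(range(len(response)))
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        stat = pearson(lower_triangle(permute_matrix(response, order)), x)
        if abs(stat) >= abs(observed):
            extreme += 1
    # Counting the observed value itself is one common convention (assumed).
    return observed, (extreme + 1) / (n_perm + 1)


# Toy data: six sites on a line; the response is a rescaled predictor.
coords = [0, 1, 2, 3, 4, 5]
predictor = [[abs(a - b) for b in coords] for a in coords]
response = [[2 * abs(a - b) for b in coords] for a in coords]
observed, p_value = permutation_p_value(response, predictor)
```

    Shuffling individual cells of the response matrix instead would break the dependence among distances that involve the same network, which is exactly what this permutation scheme is designed to preserve.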
    To do this, we followed the approach from a previous study (ref. 66) and tested whether F values of smooth terms and t values of categorical variables (binary version of ecoregion and biome distances) changed significantly when jackknifing across studies. We did this by dropping one study from the dataset and re-fitting the models, and then repeating this same process for all the studies in our dataset.

    We found a number of consistent patterns within different subsets of the data (Supplementary Figs. 15 and 16); however, some of the patterns we observed appear to be driven by individual studies with multiple networks, and hence are less representative. For instance, the study with the greatest number of networks in our dataset (study ID = 76), which contains 35 plant-frugivore networks sampled across an elevation gradient in Mt. Kilimanjaro, Tanzania (ref. 67), had an overall high influence on the results when compared with the other studies. By re-running our GAM-based MRM models after removing this study from our dataset, we found that the effect of biome boundaries on interaction dissimilarity is no longer significant, whereas the effects of ecoregion boundaries, human disturbance distance, spatial distance and elevational differences remained consistent with those from the full dataset (Supplementary Table 33). Nevertheless, all the results were qualitatively similar to those obtained for the entire dataset when using network structural dissimilarity as the response variable (Supplementary Table 34).

    Rewiring analysis

    Interaction rewiring (βOS) estimates the extent to which shared species interact differently (ref. 27). Because this metric can only be calculated for networks that share species from both trophic levels, we selected a subset of network pairs that shared plants and frugivorous birds (N = 1314) to test whether interaction rewiring increases across large-scale environmental gradients.
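    The three dissimilarity components used in this study (βS, βWN and βOS) can be sketched for a single pair of networks as follows. This Python sketch is not the betalink implementation. The source does not state which dissimilarity index was used, so Whittaker's index is an assumption here, and the function names and example networks are hypothetical. βOS is returned as None when the two networks have no interactions among shared species, which mirrors the reason a full distance matrix could not be built for this metric.

```python
def whittaker(a, b, c):
    """Whittaker dissimilarity: a shared items, b and c items unique to each set."""
    return (a + b + c) / ((2 * a + b + c) / 2) - 1


def set_dissimilarity(x, y):
    """Dissimilarity between two sets, or None if both are empty."""
    if not x and not y:
        return None
    return whittaker(len(x & y), len(x - y), len(y - x))


def network_beta(net1, net2):
    """Return (beta_S, beta_WN, beta_OS) for two sets of (plant, bird) links."""
    net1, net2 = set(net1), set(net2)
    plants1, birds1 = {p for p, _ in net1}, {b for _, b in net1}
    plants2, birds2 = {p for p, _ in net2}, {b for _, b in net2}
    # Species are tagged by trophic level so that names cannot collide.
    species1 = {("plant", p) for p in plants1} | {("bird", b) for b in birds1}
    species2 = {("plant", p) for p in plants2} | {("bird", b) for b in birds2}
    shared_plants, shared_birds = plants1 & plants2, birds1 & birds2
    # Sub-webs restricted to species present in both networks.
    sub1 = {(p, b) for p, b in net1 if p in shared_plants and b in shared_birds}
    sub2 = {(p, b) for p, b in net2 if p in shared_plants and b in shared_birds}
    beta_s = set_dissimilarity(species1, species2)
    beta_wn = set_dissimilarity(net1, net2)
    beta_os = set_dissimilarity(sub1, sub2)
    return beta_s, beta_wn, beta_os


# Hypothetical pair of networks sharing two plants and two birds.
net_a = {("P1", "B1"), ("P1", "B2"), ("P2", "B1")}
net_b = {("P1", "B1"), ("P2", "B2"), ("P3", "B1")}
beta_s, beta_wn, beta_os = network_beta(net_a, net_b)
```

    In this example, species composition is very similar (βS = 1/9), yet the shared species interact quite differently (βOS = 0.6), which is the signal the rewiring analysis looks for.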
    Importantly, since not all possible combinations of network pairs contained values of interaction rewiring (i.e., not all pairs of networks shared species), a pairwise distance matrix could not be generated for this metric. Thus, we were not able to use the same statistical approach used in our main analysis, which is based on distance matrices (see Statistical analysis section). Instead, we performed a Generalized Additive Mixed-effects Model (GAMM) using ecoregion, biome, human disturbance, spatial, elevational, and sampling-related distance metrics as fixed effects and network IDs as random effects (to account for the non-independence of distances) (Supplementary Table 35). We also performed a reduced model with only ecoregion and biome distance metrics as predictor variables (Supplementary Table 36). The binary versions of ecoregion and biome distance metrics (with two levels, “same” or “distinct”) were used as categorical variables in both models. Interaction rewiring (βOS) was calculated using the network_betadiversity function from the betalink package (ref. 52) in R (ref. 45). Although it has been recently argued that this metric may overestimate the importance of rewiring for network dissimilarity (ref. 68), our main focus was not the partitioning of network dissimilarity into species turnover and rewiring components, but rather simply detecting whether the sub-web of shared species interacted differently. In this case, βOS (as developed by ref. 27) is an adequate and useful metric (ref. 68). We fitted our models using the gamm4 function from the gamm4 package (ref. 69) in R (ref. 45). Smoothing parameters were estimated using restricted maximum likelihood (REML) (ref. 29).

    Reporting summary

    Further information on research design is available in the Nature Research Reporting Summary linked to this article.

  • in

    Towards process-oriented management of tropical reefs in the anthropocene


    Google Scholar 
    Edgar, G. J. & Stuart-Smith, R. D. Systematic global assessment of reef fish communities by the Reef Life Survey program. Sci. Data 1, 140007 (2014).Article 

    Google Scholar 
    Parravicini, V. et al. Global patterns and predictors of tropical reef fish species richness. Ecography 36, 1254–1262 (2013).Article 

    Google Scholar 
    Morais, R. A. & Bellwood, D. R. Global drivers of reef fish growth. Fish Fish. 19, 874–889 (2018).Article 

    Google Scholar 
    Gislason, H., Daan, N., Rice, J. C. & Pope, J. G. Size, growth, temperature and the natural mortality of marine fish: natural mortality and size. Fish Fish. 11, 149–158 (2010).Article 

    Google Scholar 
    Graham, N. A. J. et al. Human disruption of coral reef trophic structure. Curr. Biol. 27, 231–236 (2017).Article 
    CAS 

    Google Scholar 
    Froese, R. & Pauly, D. (eds.). FishBase. Version 06/2022. https://www.fishbase.org (2022).Cochrane, K. L. Reconciling sustainability, economic efficiency and equity in marine fisheries: has there been progress in the last 20 years? Fish Fish. 22, 298–323 (2021).Article 

    Google Scholar 
    Morais, R. A., Siqueira, A. C., Smallhorn-West, P. F. & Bellwood, D. R. Spatial subsidies drive sweet spots of tropical marine biomass production. PLoS Biol. 19, e3001435 (2021).Article 
    CAS 

    Google Scholar 
    Hamilton, M. et al. Climate impacts alter fisheries productivity and turnover on coral reefs. Coral Reefs https://doi.org/10.1007/s00338-022-02265-4 (2022).Cooke, R. et al. Anthropogenic disruptions to longstanding patterns of trophic-size structure in vertebrates. Nat Ecol Evol. 6, 684–692 (2022).Article 

    Google Scholar 
    Eddy, T. D. et al. Energy flow through marine ecosystems: confronting transfer efficiency. Trends Ecol. Evol. 36, 76–86 (2021).Article 

    Google Scholar 
    Devillers, R. et al. Reinventing residual reserves in the sea: are we favouring ease of establishment over need for protection? Aquat. Conserv. Mar. Freshw. Ecosyst. 25, 480–504 (2015).Article 

    Google Scholar 
    Fontoura, L. et al. Protecting connectivity promotes successful biodiversity and fisheries conservation. Science 375, 336–340 (2022).Article 
    CAS 

    Google Scholar 
    Gill, D. A. et al. Capacity shortfalls hinder the performance of marine protected areas globally. Nature 543, 665–669 (2017).Article 
    CAS 

    Google Scholar 
    Agardy, T., di Sciara, G. N. & Christie, P. Mind the gap: addressing the shortcomings of marine protected areas through large scale marine spatial planning. Mar. Policy 35, 226–232 (2011).Article 

    Google Scholar 
    Robinson, J. P. W. et al. Habitat and fishing control grazing potential on coral reefs. Funct. Ecol. 34, 240–251 (2020).Article 

    Google Scholar 
    Robinson, J. P. W. et al. Productive instability of coral reef fisheries after climate-driven regime shifts. Nat. Ecol. Evol. 3, 183–190 (2019).Article 

    Google Scholar 
    Dudley, N. et al. The essential role of other effective area-based conservation measures in achieving big bold conservation targets. Glob. Ecol. Conserv. 15, e00424 (2018).Article 

    Google Scholar 
    Zupan, M. et al. How good is your marine protected area at curbing threats? Biol. Conserv. 221, 237–245 (2018).Article 

    Google Scholar 
    Pollnac, R. et al. Marine reserves as linked social–ecological systems. Proc. Natl Acad. Sci. USA 107, 18262–18265 (2010).Article 
    CAS 

    Google Scholar 
    McClanahan, T. R., Marnane, M. J., Cinner, J. E. & Kiene, W. E. A comparison of marine protected areas and alternative approaches to coral-reef management. Curr. Biol. 16, 1408–1413 (2006).Article 
    CAS 

    Google Scholar 
    Smallhorn-West, P. F., Weeks, R., Gurney, G. & Pressey, R. L. Ecological and socioeconomic impacts of marine protected areas in the South Pacific: assessing the evidence base. Biodivers. Conserv. 29, 349–380 (2020).Article 

    Google Scholar 
    Cinner, J. E. et al. Sixteen years of social and ecological dynamics reveal challenges and opportunities for adaptive management in sustaining the commons. Proc. Natl Acad. Sci. USA 116, 26474–26483 (2019).Article 
    CAS 

    Google Scholar 
    Wilson, S. K. et al. Habitat degradation and fishing effects on the size structure of coral reef fish communities. Ecol. Appl. 20, 442–451 (2010).Article 
    CAS 

    Google Scholar 
    Nash, K. L. & Graham, N. A. J. Ecological indicators for coral reef fisheries management. Fish Fish. 17, 1029–1054 (2016).Article 

    Google Scholar 
    Brandl, S. J., Goatley, C. H. R., Bellwood, D. R. & Tornabene, L. The hidden half: ecology and evolution of cryptobenthic fishes on coral reefs. Biol. Rev. 93, 1846–1873 (2018).Article 

    Google Scholar 
    Willis, T. J. Visual census methods underestimate density and diversity of cryptic reef fishes. J. Fish. Biol. 59, 1408–1411 (2001).Article 

    Google Scholar 
    Allen, K. R. Relation between production and biomass. J. Fish. Res. Board Can. 28, 1573–1581 (1971).Article 

    Google Scholar 
    Leigh, E. G. On the relation between the productivity, biomass, diversity, and stability of a community. Proc. Natl Acad. Sci. USA 53, 777–783 (1965).Article 
    CAS 

    Google Scholar 
    R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2020).Cinner, J. E., Daw, T. & McClanahan, T. R. Socioeconomic factors that affect artisanal fishers’ readiness to exit a declining fishery. Conserv. Biol. 23, 124–130 (2009).Article 
    CAS 

    Google Scholar 
    Cinner, J. E. et al. Linking social and ecological systems to sustain coral reef fisheries. Curr. Biol. 19, 206–212 (2009).Article 
    CAS 

    Google Scholar 
    Hicks, C. C., Crowder, L. B., Graham, N. A., Kittinger, J. N. & Cornu, E. L. Social drivers forewarn of marine regime shifts. Front. Ecol. Environ. 14, 252–260 (2016).Article 

    Google Scholar 
    Espinosa-Romero, M. J., Rodriguez, L. F., Weaver, A. H., Villanueva-Aznar, C. & Torre, J. The changing role of NGOs in Mexican small-scale fisheries: from environmental conservation to multi-scale governance. Mar. Policy 50, 290–299 (2014).Article 

    Google Scholar 
    Cutler, D. R. et al. Random forests for classification in ecology. Ecology 88, 2783–2792 (2007).Article 

    Google Scholar 
    Edgar, G. J. et al. Establishing the ecological basis for conservation of shallow marine life using Reef Life Survey. Biol. Conserv. 252, 108855 (2020).Article 

    Google Scholar 
    Selig, E. R. et al. Mapping global human dependence on marine ecosystems. Conserv. Lett. 12, e12617 (2019).Article 

    Google Scholar  More

  • in

    Managing reefs for productivity


  • in

    Metagenomic analysis of diarrheal stools in Kolkata, India, indicates the possibility of subclinical infection of Vibrio cholerae O1

    Sample collection and isolation of V. cholerae O1 possessing the CT gene

    Twenty-three patients (patient numbers 9 to 31) who were diagnosed with cholera were examined in this study. The diagnosis was confirmed by the isolation of V. cholerae O1 from the stool of each patient. The age of each patient, date of hospital admission, stool sampling date, pathogen isolated and medicines administered as treatment are described in Supplementary Table S1. Twenty-one of the stool samples were collected on the first day of hospitalization, while the remaining two were collected on the second day (patient number 29) and fourth day (patient number 10) of hospitalization. None of the patients had been given any antibiotics, and the samples of diarrheal stool were taken during severe diarrhea.

    To confirm the presence of the CT gene (ctx) in these 23 isolates, we tested them for ctxA by PCR. The PCR was performed as reported by Keasler and Hall6, with 30 cycles of amplification. The size of the amplified ctxA fragment was 302 bp. The target fragment was amplified from each of the V. cholerae O1 isolates, indicating that all of the isolates from the 23 cholera patients possessed ctxA.

    CT production from the isolates

    The production of CT by these 23 isolates was examined by detecting secreted CT in the medium. The 23 isolates were cultured statically in AKI medium7, and the secreted CT in the culture supernatants was measured using the GM1-ganglioside enzyme-linked immunosorbent assay (ELISA) method8. The detection limit of CT by the ELISA method used is 1.0 ng ml−1. All the samples examined contained CT above this concentration (Fig. 1), showing that all isolates examined are toxigenic V. cholerae O1.

    Figure 1. Amount of cholera toxin produced by V. cholerae O1 isolated from patients with diarrhea. Twenty-three strains of V.
cholerae O1 were isolated from 23 patients with diarrhea. These isolates were cultured statically in AKI medium7 at 37 °C for 24 h. After removing the cells by centrifugation, the CT in the culture supernatants was measured using a GM1-ganglioside ELISA method8. The samples marked with blue circles are the isolates obtained by bacterial culture from the two patients (patient 12 and patient 18) who are the focus of this study.

    Analysis of the stool samples of patients diagnosed with cholera disease

    Metagenomic sequencing analysis

    The primary objective of this metagenomic analysis is to determine the proportion of V. cholerae in the diarrheal stool. If the number of V. cholerae in the intestinal tract is small, the etiological microorganisms causing diarrhea in that patient must then be identified, which requires investigating the presence of pathogenic microorganisms other than V. cholerae in the stool. To do this, the reads obtained by metagenomic analysis need to be analyzed comprehensively. We therefore planned to assign reads with the Burrows-Wheeler Alignment tool (BWA) with default parameters, alignment software capable of fulfilling these objectives9 (http://bio-bwa.sourceforge.net).

    However, we were concerned that genes derived from organisms other than V. cholerae in the stool might be mistakenly counted as genes derived from V. cholerae in the analysis using BWA. We therefore first examined stool from people unrelated to cholera disease, to ensure that the planned analysis method would correctly detect genes from V. cholerae in stool. For this analysis we used DNA sequences reported by the NIH Human Microbiome Project (https://www.hmpdacc.org/hmp/hmp/hmasm2/). The sequences analyzed are DNA derived from the feces of 20 healthy individuals (10 males and 10 females).
The results are shown in Supplementary Table S2.

    The number of reads analyzed varied from sample to sample. The largest number obtained after quality filtering was 60,975,797 and the lowest was 10,301,809. However, the number of reads detected as originating from V. cholerae was very small (12 reads or fewer) in all samples, and none were detected in 7 samples. This very small number shows that the analytical method used is suitable for detecting genes from V. cholerae in these samples.

    We therefore analyzed the DNA and RNA samples prepared from diarrheal stool by the method using BWA. All raw sequencing data obtained were deposited in the DDBJ Sequence Read Archive under the accession code PRJDB10675. This accession can be searched not only from DDBJ but also from EMBL and GenBank.

    Diarrheal stools are mostly composed of liquid, and their properties are very different from those of normal stools. The origin of the nucleic acids in diarrheal stools varies from patient to patient. One sample may contain many genes derived from human cells, while another may contain many genes derived from microorganisms. To clarify the nature of the reads we obtained, we determined the proportion of reads of bacterial origin relative to the total number of reads in each sample, and presented this proportion in order of patient age (Fig. 2a). The ratios were not consistent, indicating that the cells of eukaryotic origin and the microorganisms present in the stool of patients with diarrhea varied from person to person.

    Figure 2. Age of patient and the ratio of the number of reads detected by metagenomic sequencing analysis of their stools. The DNA in the stool samples from 23 patients who were diagnosed with cholera disease was extracted using a commercially available kit. Patient ages are listed in Supplementary Table S1. The extracted DNA was investigated by metagenomic sequencing analysis to clarify the origin of individual DNA.
The origin of the DNA sequences was assigned by mapping to a database that included human and microorganism sequences. The numbers of total reads, total bacterial reads, reads originating from V. cholerae and reads originating from ctxA in each sample are shown in Supplementary Table S3 (the data from the DNA samples). The ratio of the number of reads from all bacteria to the total number of reads after filtering (a) and the ratio of the number of reads from V. cholerae to the number of reads from all bacteria (b) were calculated. The horizontal axis shows the age of each patient, in the same arrangement in both (a) and (b). The numbers in parentheses indicate the sample numbers, which are also the patient numbers.

    This result implied that it would be difficult to detect V. cholerae in a sample with a small number of reads derived from bacteria, so it was unclear whether the data obtained were suitable for the detection of V. cholerae. To examine whether the data shown in Fig. 2a can be used to clarify the infection status of V. cholerae, the ratio of the reads from V. cholerae to the reads from all bacteria in each sample was calculated (Fig. 2b). Reads from V. cholerae were detected even in samples with a low ratio of bacterial genes, as seen in patients 13, 25, and 29. Conversely, some patients, such as patients 10, 18, and 17, had a high proportion of bacterial genes but a low proportion of reads from V. cholerae (Fig. 2a,b). From these results, we judged that the data obtained are useful for analyzing the infection status of V. cholerae in the intestinal tract of the examined patients. The data also showed that patient age did not affect the intestinal retention of V. cholerae.

    To illustrate the presence of V. cholerae in the diarrheal stools of the patients more clearly, the ratio of reads from V.
cholerae to total reads for each sample, which was determined in Fig. 2a, was sorted in descending order. The results are shown in Fig. 3a. The ratio (percentage) for each patient is indicated by the blue bar in the figure. The numbers in parentheses after the sample number, beginning with the letter D, indicate the rank from lowest to highest percentage. As shown in Fig. 3a, the percentage of V. cholerae that the patients carried in their stools varied from 0.003% (sample 12(D1)) to 38.337% (sample 28(D23)).

    Figure 3. The ratios of DNA and RNA derived from V. cholerae in stool samples. The DNA and RNA in the stool samples from 23 patients who were diagnosed with cholera disease were extracted using a commercially available kit. Subsequently, the RNA samples were treated with DNase I to remove DNA. Reverse-transcribed DNA was prepared from these RNA samples using random primers and reverse transcriptase. The extracted DNA and reverse-transcribed DNA were investigated by metagenomic sequencing analysis to clarify the origin of individual DNA and RNA. The origin of the DNA sequences was assigned by mapping to a database that included human and microorganism sequences. The numbers of total reads, total bacterial reads, reads originating from V. cholerae and reads originating from ctxA in each sample are shown in Supplementary Tables S3 (the data from DNA) and S4 (the data from RNA). The percentages of DNA reads from V. cholerae and from ctxA relative to the total reads are presented by the blue bars and red bars in panel a, respectively. The percentages of DNA reads from V. cholerae relative to the total bacterial reads are presented in panel b. Samples are arranged in ascending order of the ratio of reads from V. cholerae to the total reads in the DNA analysis. The rank of each sample is given by the number in parentheses starting with the letter D. The samples in these panels are arranged in the order of the D number.
Similarly, the ratios of reads from V. cholerae to the total RNA reads and to the total bacterial RNA reads are presented in panels c and d, respectively. The samples marked with red circles are from the diarrheal stools of the patients who are the focus of this analysis.

    However, what we want to reveal in this study is the presence of toxigenic V. cholerae producing CT. The reads represented by the blue bars in Fig. 3a appear to include genes derived from toxigenic V. cholerae, but this cannot be concluded from these data alone. It is quite possible that they include genes derived from V. cholerae not possessing ctx, or from other bacteria having the same gene sequences as V. cholerae. Therefore, to examine the existence of V. cholerae possessing ctx, we examined the number of reads derived from ctxA (Supplementary Table S3). Among the samples with D number 8 or higher, reads derived from ctxA were detected in all samples except one (D9). The ratio of reads from ctxA to the total DNA reads is shown by the red bars in Fig. 3a. This ratio was correlated with the ratio of reads derived from V. cholerae genes to the total DNA reads (Fig. 3a). From these results, it seems that most of the V. cholerae genes detected in Fig. 3a are derived from V. cholerae possessing ctx.

    Furthermore, the ratio of the number of reads from V. cholerae to the number of reads from total bacteria, which was obtained in Fig. 2a, was arranged in the order used in Fig. 3a (the order of the ratio of reads from V. cholerae to total DNA reads) (Fig. 3b). Figure 3b shows that samples with a large D number have a large proportion of V. cholerae among the bacteria. The highest value was obtained from sample 24 (D22), in which 95.917% of the bacterial reads were from V.
cholerae.

    On the other hand, in many samples with small D numbers this ratio is small, but there are exceptions. For example, in samples 25 (D3), 29 (D8) and 13 (D11), the presence of V. cholerae is clear. Although less pronounced than in these three samples, the presence of V. cholerae in other samples such as 22 (D4), 21 (D5), 9 (D10) and 11 (D12) is also evident, albeit in small quantities (Fig. 3b). Therefore, these patients were considered to be infected with V. cholerae. These results seem to accurately reflect the actual state of V. cholerae in the stool, and so the infection status of V. cholerae in each patient could be inferred from the data obtained.

    As shown in Fig. 3b, in samples 18 (D2), 12 (D1), 17 (D7), 10 (D9) and 23 (D6), the ratio of reads from V. cholerae to reads from total bacteria is very low, at 0.032%, 0.118%, 0.225%, 0.244% and 0.285%, respectively. It was unknown whether these patients were infected with V. cholerae and had developed diarrhea because of that infection, so further examination was needed. These five samples are marked by red circles in Fig. 3a,b.

    Subsequently, we examined the ratio of RNA reads from V. cholerae to clarify the expression of V. cholerae genes in the intestinal lumen of these patients. RNA samples were prepared by different methods from the patients’ stool, and the RNA in these samples was analyzed by metagenomic sequencing. The ratios of the number of reads derived from V. cholerae RNA to the number of reads derived from total RNA and from total bacterial RNA in each sample were determined. The results are shown in Fig. 3c,d, respectively. Samples that had few reads for genes derived from V. cholerae in the preceding analysis of DNA reads (Fig. 3a,b) are also marked with red circles in Fig. 3c,d.
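
    The percentages and D ranks used in Fig. 3 come down to simple arithmetic on per-sample read counts. The following is a minimal sketch of that arithmetic, not the authors' pipeline: the names SampleCounts, percent and rank_by_vc_share are hypothetical, and the counts in the demo are invented for illustration rather than taken from the Supplementary Tables. It assumes that D1 denotes the sample with the lowest share of V. cholerae reads among total reads.

```python
from dataclasses import dataclass


@dataclass
class SampleCounts:
    sample_id: int          # sample (patient) number
    total_reads: int        # reads remaining after quality filtering
    bacterial_reads: int    # reads assigned to any bacterium
    vc_reads: int           # reads assigned to V. cholerae


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage; 0.0 when the denominator is zero."""
    return 100.0 * part / whole if whole else 0.0


def rank_by_vc_share(samples):
    """Rank samples by V. cholerae share of total reads, lowest first (D1)."""
    rows = [
        (s.sample_id,
         percent(s.vc_reads, s.total_reads),
         percent(s.vc_reads, s.bacterial_reads))
        for s in samples
    ]
    rows.sort(key=lambda row: row[1])
    return [(f"D{rank}", sid, of_total, of_bacterial)
            for rank, (sid, of_total, of_bacterial) in enumerate(rows, start=1)]


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not values from the study.
    demo = [
        SampleCounts(1, 1_000_000, 100_000, 1_000),
        SampleCounts(2, 2_000_000, 50_000, 100),
        SampleCounts(3, 500_000, 400_000, 200_000),
    ]
    for label, sid, of_total, of_bacterial in rank_by_vc_share(demo):
        print(f"{sid}({label}): {of_total:.3f}% of total, "
              f"{of_bacterial:.3f}% of bacterial")
```

    Keeping both denominators side by side makes it easy to spot samples whose share of total reads is low only because bacterial reads as a whole are scarce.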
These samples also had low proportions of RNA reads from V. cholerae. In particular, the ratio of RNA reads from V. cholerae to total bacterial RNA reads in samples 12 (D1) and 18 (D2) was low, 0.038% and 0.236%, respectively (Supplementary Table S4, Fig. 3d). Judging from these low values, it is doubtful that these two patients, patients 12 and 18, had diarrhea due to infection with V. cholerae.

    Detection of ctxA by PCR

    Subsequently, we amplified ctxA by PCR in the DNA samples extracted from the stool samples, in order to reconfirm the presence of ctx in the stool. The PCR was performed under the same conditions used for the detection of ctxA in the isolates, as described above in the “Sample collection and isolation of V. cholerae O1 possessing the CT gene” section of the “Results”. Amplification was again performed for 30 cycles.

    From the results of metagenomic sequencing shown in Fig. 3, we found that the samples from patient 12 (D1) and patient 18 (D2) contained few genes derived from V. cholerae O1. The results obtained by PCR are shown in Fig. 4, where the samples from these two patients are marked with blue circles. No distinct band corresponding to ctxA was detected in the lane for sample 12(D1). A very faint band was visible in the lane for sample 18(D2); however, small amounts of sample often spill into adjacent lanes when samples are loaded for agarose gel electrophoresis. Hence, we concluded that the amount of ctxA amplified by PCR in these two samples was very low. This supports our inference that the diarrhea in these two patients was not caused by infection with V. cholerae O1.

    Figure 4. PCR to detect ctxA in the stool samples of diarrhea patients. DNA was extracted from the stool samples of 23 patients who were diagnosed with cholera disease.
PCR to amplify ctxA in these DNA samples was performed using the specific primers ctcagacgggatttgttaggcacg and tctatctctgtagcccctattacg6, and the products were analyzed by agarose gel electrophoresis. The sample numbers are the same as the numbers shown in the footnotes of Fig. 3. Numbers beginning with D in parentheses show the rank of the content of DNA from V. cholerae among these samples. The samples marked with blue circles are from the diarrheal stools of the patients (patients 12 and 18) who are the focus of this study. S: the size marker for gel electrophoresis; N: the negative control, in which no DNA was added to the reaction mixture; P: the positive control, in which DNA prepared from V. cholerae O1 N1696128 was added to the reaction mixture.

    Similarly, clear bands were not detected in samples 9(D10), 10(D9), 13(D11), 22(D4), and 25(D3). The metagenomic analysis of these samples showed that the number of reads from V. cholerae was low and that ctxA was either not detected (samples 10(D9), 22(D4) and 25(D3)) or detected only in small amounts (samples 9(D10) and 13(D11)) (Supplementary Table S3, Fig. 3a).

    The amount of sample added to the PCR reaction solution was as small as 5 µl, and it is not clear whether this small volume contained enough ctxA for amplification. It is also possible that the samples contained substances that inhibit amplification by PCR. We believe that these are the reasons why no clear band corresponding to ctxA appeared in this PCR. However, it is clear from the results of Fig. 3b,d that these samples (9(D10), 10(D9), 13(D11), 22(D4), and 25(D3)) contain genes derived from V. cholerae (ctx). Therefore, we considered these patients to be infected with V. cholerae.

    The levels of CT and proteolytic activity in the stool samples

    From the genetic studies in Figs. 3 and 4, it was inferred that V.
cholerae O1 was not involved in the onset of the diarrhea in two patients (12(D1) and 18(D2)). However, this inference was based on the amplification and analysis of genetic material prepared from the patients’ diarrheal stool, and there is no proof that the sample preparation and analysis were fully reliable. Hence, we thought it necessary to analyze the samples from a different perspective and by different means.

    We therefore attempted to measure the amount of CT. CT is the toxin responsible for the diarrhea caused by V. cholerae O1. It is released into the intestinal lumen, where it acts on the patient’s intestinal cells to induce diarrhea. Thus, we measured the CT content in the stool samples. We also measured the proteolytic activity in the stool samples, because CT is sensitive to proteolysis and we were concerned that CT would be degraded by proteases during storage outside the body.

    The CT content and the proteolytic activity in the stool samples of the 23 cholera patients were measured by the GM1-ganglioside ELISA method and by the lysis of casein, respectively8,10, and the results are presented in Fig. 5a,b, respectively.

    Figure 5. The levels of CT and proteolytic activity in the stool samples. Twenty-three stool samples from patients who were diagnosed with cholera disease were centrifuged at 10,000×g for 10 min. The CT content of the supernatants was determined using a GM1-ganglioside ELISA method8 (a). The proteolytic activity of the supernatants was determined by the lysis of casein10 (b). The sample numbers are the same as the numbers shown in the footnotes of Fig. 3. Numbers beginning with D in parentheses show the rank of the content of DNA from V. cholerae among these samples. The samples in this figure are arranged in the order of the D numbers. No bar is drawn for samples whose CT amount was below the detection limit. From the tests shown in Figs.
2, 3 and 4, the samples of the two patients whose diarrhea is unlikely to have been caused by infection with V. cholerae are marked with blue circles. O.D.: optical density.

    Proteolytic activity was detected in all samples, although its strength differed among samples. High protease activity was not associated with decreased levels of CT; for example, sample 11(D12) showed the highest protease activity among the samples examined, and the amount of CT in that sample was also high. Therefore, we considered that proteolytic activity had almost no influence on the amount of CT in this study. Furthermore, the fact that protease activity was found in all samples indicated that the samples had been collected and stored without significant denaturation.

    The ELISA method used in this assay can accurately detect CT at concentrations above 1.0 ng ml−1, but cannot accurately determine concentrations below 1.0 ng ml−1. Therefore, we treated samples containing less than 1.0 ng ml−1 of CT as containing no CT.

    As described above, the genetic analysis led us to consider that the diarrhea in the two patients (12(D1) and 18(D2)) was not due to infection with V. cholerae O1. The analysis of CT in the stool samples showed that the CT concentrations of these two samples were below the detection limit (Fig. 5a). This indicates that the number of V. cholerae O1 in the intestinal lumen of these patients was extremely low at the time of sampling.

    Investigation of diarrheagenic microorganisms in diarrheal stool

    These results suggested that the diarrhea in patients 12 (D1) and 18 (D2) may have been caused by infection with microorganisms other than V. cholerae. We therefore examined the metagenomic sequencing data of these two patients to identify the diarrhea-causing microorganisms (DDBJ Sequence Read Archive, accession code PRJDB10675).
As a result, we found that DNA from two bacteria, Streptococcus pneumoniae and Salmonella enterica, was abundant in the stools of patients 12(D1) and 18(D2), respectively.

    The ratios of DNA reads from St. pneumoniae in the DNA sample of patient 12(D1) to the total DNA and to the total bacterial DNA are 0.095% and 3.988%, respectively, whereas the corresponding ratios for V. cholerae in this patient are 0.003% and 0.118%. Those of S. enterica in the stool of patient 18(D2) are 0.536% and 1.118%, respectively, whereas the corresponding ratios for V. cholerae in this patient are 0.015% and 0.032% (Supplementary Table 2).

    These two bacteria, St. pneumoniae and S. enterica, are not detected as normal intestinal bacteria. As shown, the ratios of DNA from each bacterium in the diarrheal stool are much higher than those of V. cholerae. Therefore, these two bacteria are considered to be related to the symptoms of the respective patients.

    Nonetheless, toxigenic V. cholerae O1 was also isolated from these two patients in laboratory bacteriology tests. It is likely that some of the very few V. cholerae O1 in the intestinal tract were expelled with the diarrhea and subsequently detected by the enrichment culture for V. cholerae. This indicates that V. cholerae O1 may cause subclinical infections in residents of the Kolkata region of India. In such subclinical infections, the number of V. cholerae O1 inhabiting the intestinal tract might be small.

    Surveillance of patient samples where no diarrhea-causing microorganisms were detected

    To detect people with a subclinical infection of V. cholerae O1, we further analyzed the specific-pathogen-free stool samples of diarrhea patients. “Specific-pathogen-free stool sample” refers to a stool sample in which no etiological agent of diarrhea, including V.
cholerae, was detected by our bacterial examination in the laboratory.

The number of samples examined in this analysis was 22 (sample numbers 1001 to 1022). All 22 diarrhea patients examined were inpatients at ID hospital, Kolkata. Stool samples from 20 of the 22 patients were collected on the 1st day of hospitalization, and the stools of the remaining two patients (patients 1004 and 2022) were collected on the 2nd day of hospitalization. Antibiotic use in these patients was limited. Ofloxacin was the only antibiotic administered, and only four patients (patients 1001, 1011, 1012, and 1021) received it (Supplementary Table S1).

DNA and RNA were extracted from the stool samples and analyzed by metagenomic sequencing using the same method used in the analysis of diarrheal stools from cholera patients.

Reads of genes from V. cholerae were detected in every sample, although the value varied from sample to sample (Supplementary Tables S5 and S6). However, we do not believe that every stool sample examined contained V. cholerae. In the metagenomic analysis, if the base sequence of a read was common to multiple bacteria, the read was recognized as being derived from each of those bacteria. Therefore, even if a bacterium is not present in the sample, reads that it shares with other bacteria are counted as its reads; i.e., if a read from a bacterium other than V. cholerae is homologous to the corresponding gene of V. cholerae, it is recorded as one gene derived from V. cholerae found in the sample. The total number of such reads is finally counted as the number of reads of V. cholerae. Therefore, it is unclear whether bacteria with a low read count are actually present in the sample. To address this problem, not only the reads derived from V.
cholerae but also the reads derived from ctxA were searched for in the samples.

In addition, as described above, other DNA present in diarrheal stool, such as food-derived DNA, might hinder the analysis of the bacteria in the stool. We therefore determined four relative values of the number of reads from the genes of V. cholerae: the ratio of the DNA reads of V. cholerae to the total DNA; the ratio of the DNA reads of V. cholerae to the total bacterial DNA; the ratio of the RNA reads of V. cholerae to the total RNA; and the ratio of the RNA reads of V. cholerae to the total bacterial RNA. Furthermore, we determined the ratio of the number of reads from ctxA to the total DNA (Supplementary Tables S5 and S6). These ratios are also shown in Fig. 6a–d.

Figure 6. The ratio of DNA and RNA derived from V. cholerae in stool samples of the specific-pathogen-free patients. The stool samples from 22 diarrheal patients in which no etiological agent of diarrhea, including V. cholerae, was detected by our bacterial examination in the laboratory were analyzed. The extraction of DNA and RNA, and the preparation of reverse-transcribed DNA samples from the RNA samples, were performed in the same manner as in Fig. 2. The origin of the reads obtained in this analysis was assigned by mapping to a database that included human and microorganism sequences. The numbers of total reads, total bacterial reads, reads originating from V. cholerae, and reads from ctxA in each sample are shown in Supplementary Tables S5 (the data from DNA) and S6 (the data from RNA). The percentages of reads of DNA from V. cholerae (blue bar in a) and of reads of DNA from ctxA (red bar in a) relative to the total DNA reads, and the percentages of reads of DNA from V. cholerae relative to the total bacterial DNA reads (b) are presented. Similarly, the results obtained from the RNA samples are presented in (c) and (d). Panels (c) and (d) show the percentages of reads of RNA from V.
cholerae relative to the total RNA reads and to the reads of total bacterial RNA, respectively. The samples indicated by green circles are the samples of interest in this manuscript, as described in the text.

The ratios of the number of reads derived from DNA of V. cholerae and the number of reads derived from ctxA to the number of total DNA reads in these samples are shown by the blue and red bars in Fig. 6a, respectively. Reads from ctxA were detected in samples 1004, 1006, 1010, 1017 and 1018. This indicates that V. cholerae possessing ctx were alive in these samples (1004, 1006, 1010, 1017 and 1018).

The ratio of V. cholerae DNA to total bacterial DNA in these samples was then examined (Fig. 6b). The proportions of V. cholerae DNA to total bacterial DNA in the stools of patients 1004, 1006, 1010, 1017, and 1018 are 28.633%, 0.234%, 73.068%, 2.282%, and 2.774%, respectively.

In addition, the RNA reads from V. cholerae were examined, and their ratios to total RNA and to total bacterial RNA were calculated. RNA derived from V. cholerae was reliably detected in 4 of the 5 samples (1004, 1010, 1017, 1018). The ratios for the remaining sample (1006) were low (Fig. 6c,d). However, sample 1006 contained reads from the DNA of ctxA (Supplementary Table S5). Therefore, we considered all five samples to contain toxigenic V. cholerae.

As antibiotics were not administered to these five patients, the effects of antibacterial agents could be disregarded in our examination of the bacterial species in the stools. Among these five samples, the ratios for samples 1004 and 1010 were high and comparable to those of the samples from the cholera patients (Figs. 3 and 6). We considered that the diarrhea of patients 1004 and 1010 might have been caused by infection with V.
cholerae O1.

On the other hand, the samples of patients 1006, 1017 and 1018 did not show values high enough to indicate that the diarrhea was caused by infection with V. cholerae. It is probable that the diarrhea of these three patients (1006, 1017 and 1018) was caused by factors other than V. cholerae O1, and that a small number of V. cholerae inhabit the intestinal tract as a form of subclinical infection; this would explain why a gene derived from V. cholerae was detected by the metagenomic sequencing analysis. These results support the hypothesis that subclinical infections of V. cholerae occur in Kolkata.
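
The four relative values and the ctxA ratio used in this section are simple proportions of read counts, and the arithmetic can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the names ReadCounts, percent and relative_values are hypothetical, and the read counts in the example are invented for illustration only (the actual counts are in Supplementary Tables S5 and S6).

```python
from dataclasses import dataclass


@dataclass
class ReadCounts:
    """Read counts for one library of one stool sample (hypothetical structure)."""
    total: int        # all reads in the library (human, food-derived, microbial, ...)
    bacterial: int    # reads assigned to any bacterium
    v_cholerae: int   # reads assigned to V. cholerae
    ctxa: int = 0     # reads mapped to ctxA (used for the DNA library only)


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage; 0.0 when the denominator is zero."""
    return 100.0 * part / whole if whole else 0.0


def relative_values(dna: ReadCounts, rna: ReadCounts) -> dict:
    """Compute the four V. cholerae ratios and the ctxA ratio described in the text."""
    return {
        "vc_dna_of_total_dna": percent(dna.v_cholerae, dna.total),
        "vc_dna_of_bacterial_dna": percent(dna.v_cholerae, dna.bacterial),
        "vc_rna_of_total_rna": percent(rna.v_cholerae, rna.total),
        "vc_rna_of_bacterial_rna": percent(rna.v_cholerae, rna.bacterial),
        "ctxa_dna_of_total_dna": percent(dna.ctxa, dna.total),
    }


if __name__ == "__main__":
    # Invented counts, for illustration only; not data from the study.
    dna = ReadCounts(total=1_000_000, bacterial=200_000, v_cholerae=5_000, ctxa=10)
    rna = ReadCounts(total=500_000, bacterial=100_000, v_cholerae=1_000)
    for name, value in relative_values(dna, rna).items():
        print(f"{name}: {value:.3f}%")
```

Normalizing to total bacterial reads as well as to total reads reflects the concern stated in the text that non-bacterial DNA in the stool, such as food-derived DNA, can distort the ratio to the total.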