
    GalliForm, a database of Galliformes occurrence records from the Indo-Malay and Palaearctic, 1800–2008

    These methods are an expanded version of those in our related work, Boakes et al.15.
    The database was compiled over the period 2005–2008. Data collection equates to around 1500 person-days and data were gathered by a team of 21 people. Between them, team members were fluent in English, French, German, Mandarin, Russian, Spanish and Swedish. These languages were extremely helpful in transcribing museum specimen labels and in translating publications. However, the majority of publications were in English and we acknowledge that the database will be biased toward records published in English-language publications.
    Our study focuses on the 130 galliform species that occur within the Palaearctic and Indo-Malay biogeographic realms22 (see Online-only Table 1). We have additionally included records of the Imperial Pheasant (Lophura imperialis), although it is now recognised as a hybrid rather than a species. The geographic ranges of two of the species in the database, the Red Grouse (Lagopus lagopus) and the Rock Ptarmigan (Lagopus muta), extend to North America. North American data were often included in the information that museums sent us, and in these instances we entered the records into the database since we thought they might be of use to researchers studying these species. However, we did not search exhaustively for records of these species in North America; we have merely included those that we came across.
    We attempted to gather all species distribution data that could be accessed from five different sources: museum collections, literature records, banding (ringing) data, ornithological atlases and birdwatchers’ trip report websites. For each data source, exhaustive and systematic search strategies were adopted.
    Museum collections
    Using web-based searches and Roselaar23, we identified 377 natural history collections. We found contact details for 338 of these collections and requested by email or letter a list of the Galliformes in their holdings along with collection localities and dates. Non-respondents were recontacted. In total, 135 museums were able to share data with us (see Online-only Table 2). Museum records were obtained through publicly available online databases (e.g. ORNIS), through electronic or paper catalogues sent to us by the museums, or by visiting the museums and transcribing data directly from specimens or card catalogues. Almost half of the museums we contacted did not respond despite at least one follow-up enquiry, and there was substantial variation in the amount and format of data contributed by those that did reply. Altogether, over 50% of the records came from just six museums (Natural History Museum, London; Zoological Institute of the Russian Academy of Sciences, St Petersburg; Zoological Museum of Lomonosov Moscow State University; Field Museum of Natural History, Chicago; American Museum of Natural History, New York; National Museum of Natural History, Leiden), with a single museum (the Natural History Museum, London) contributing nearly 20% of the museum records that could be georeferenced and dated15. Following databasing and/or georeferencing, records were returned to larger collections and to those who had requested the data.
    Literature
    Data from the literature were added to those previously collected by McGowan24. Entire series of key English-language international and regional ornithological journals such as Ibis, Bird Conservation International, Journal of the Bombay Natural History Society, and Kukila were scanned for relevant information, availability allowing. We began at the library of the Zoological Society of London and followed up missing journal issues at the BirdLife International library, Cambridge, UK; the British Library, London, UK; and the Edward Grey Institute, University of Oxford, UK. Relevant Chinese literature was also scanned. Additionally, data were obtained from regional reports, personal diaries, letters, newsletters, etc. stored in the archives of BirdLife International, Cambridge, UK; the World Pheasant Association, Newcastle, UK; and the Edward Grey Institute, University of Oxford, UK. Several of the species/regional experts we consulted also contributed their personal records, which were entered in the database as ‘personal communications’. As far as was possible, records were classed as primary or secondary data within the ‘dynamicProperties’ field of GalliForm14. It is important to note that some primary records or museum specimens will be duplicated within the database in the secondary data.
    Banding records
    Eighty-three ornithological banding groups were identified using web-based searches and were contacted via email. Thirty of these groups replied and only seven were able to provide us with data (see Table 1). The majority of galliform species tend not to be banded due to their large body sizes and spurs. Additionally, many of the banding groups kept their records on paper and were not able to send them to us. Nevertheless, we were able to access and georeference 15,152 banding records.
    Table 1 The ringing groups that shared data with GalliForm.

    Ornithological atlases
    We digitised location data from 20 ornithological atlases (see Table 2). Data from several other atlases were not used since the range of dates for the records was wider than 20 years.
    Table 2 The atlases that were digitised to be included in GalliForm.

    Trip report website data
    We used the two trip report websites that were popular with birders during the data recording period (2005–2008), www.travellingbirder.com and www.birdtours.co.uk. At that time, eBird (probably the most relevant online source today) did not cover the majority of the countries within our study region, and our intention in depositing this dataset is to focus on pre-eBird data that are more difficult and time-consuming to access. We extracted data from all trip reports of birdwatching visits to European, Asian and North African countries. Care was taken to enter reports that featured on both websites once only.
    Criteria for data inclusion
    To be included in the database, records had to meet the following criteria:
    1. The record identified the species of the bird concerned.
    2. The record contained either a verbal description of the locality at which the bird concerned was observed or the co-ordinates at which the bird was observed.

    Records of captive birds were excluded. Records relating to non-native occurrences were included but were flagged in the ‘establishmentMeans’ field as “introduced”.
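As a rough illustration, the two inclusion criteria and the captive/introduced rules above could be expressed as a simple check. The `CandidateRecord` class and its attribute names are hypothetical and are not GalliForm's schema; only the ‘establishmentMeans’ value “introduced” comes from the text, and the value used for native records is left empty here because the text does not state it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CandidateRecord:
    """Hypothetical stand-in for a record awaiting entry (not GalliForm's schema)."""
    species: Optional[str] = None
    locality_description: Optional[str] = None
    coordinates: Optional[Tuple[float, float]] = None  # (latitude, longitude)
    captive: bool = False
    native: bool = True


def meets_inclusion_criteria(rec: CandidateRecord) -> bool:
    """Criterion 1: species identified. Criterion 2: locality text or co-ordinates.
    Records of captive birds are excluded."""
    has_species = bool(rec.species and rec.species.strip())
    has_locality_text = bool(rec.locality_description and rec.locality_description.strip())
    has_place = has_locality_text or rec.coordinates is not None
    return has_species and has_place and not rec.captive


def establishment_means(rec: CandidateRecord) -> str:
    """Non-native occurrences are kept but flagged as "introduced".
    Returning an empty string for native records is an assumption."""
    return "" if rec.native else "introduced"
```

A record with a species name and only a verbal locality passes the check, whereas one lacking both a locality description and co-ordinates does not.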
    Data entry
    GalliForm14 was originally compiled in the programme Microsoft Access 2003. To maximise uniformity in data entry, all data recorders were given thorough and consistent training and each was provided with a set of database guidelines. An Access Database form was created to standardise data entry and to enable multiple members of the team to collect data simultaneously.
    Each entry in GalliForm14 corresponds to a single record of a single species recorded in a specific location. The data fields of GalliForm14 are described in Online-only Table 3. The taxonomy used has been updated to be consistent with the BirdLife International 2019 taxonomy (datazone.birdlife.org). All information was entered exactly as it was described in the data source, with as much information extracted as possible. Multiple records from different sources which recorded the same information were still included in the interest of completeness. The only exception to this is the trip report data in which we did not enter identical records which occurred on both the Travelling Birder and Bird Tours websites.
    The source of the data, i.e. literature, museum, atlas, ringing or website trip report is recorded in the ‘dynamicProperties’ field under the code “dataSource”. For literature data, (where known) the nature of the record, i.e. primary or secondary, is recorded under the code “datatype”.
    Taxonomy has of course changed considerably over time. To allow for this, we recorded the taxonomy as it was described in the data source in the ‘originalNameUsage’ field. The current taxonomy was then selected from a look-up table. If, at the time of data entry, the data compiler was unsure which species a synonym referred to, the species was tagged as “unknown” and was designated at a later date following further research on the synonym.
    Identical localities can also be described in multiple ways. We recorded the locality as it was given in the data source in the ‘verbatimLocality’ field. If the ‘verbatimLocality’ clearly tallied with a locality already within the database, the record was linked to that locality in order to increase georeferencing efficiency.
    It was rare for a source to record absence of evidence, i.e. a survey for a species at a particular locality which failed to find that species. However, in the few cases where we did come across such records, the locality and date of the survey were recorded and “absent” was recorded in the ‘occurrenceStatus’ field.
    Each record refers to an independent observation. For museum and ringing records, this means a single individual. For literature, atlas or trip report records this may refer to a group of birds observed in one particular locality, on one particular day. If given, the number of total individuals is recorded in the ‘individualCount’ field. The number of males and females is recorded in the ‘sex’ field and the number of juveniles and adults in the ‘lifeStage’ field. If the ‘lifeStage’ field is blank, it is reasonable to assume the individual(s) is an adult.
    Occasionally, additional information about the observation might be included in the data source, for example the habitat the bird was observed in or whether the bird was common or rare in that locality. These data are recorded in the ‘habitat’ and ‘organismQuantity’ fields, respectively. Any additional information which did not fit within the structure of the database was recorded in the ‘occurrenceRemarks’ field, along with any notes found on museum labels.
    For the purposes of data deposition, the database was converted to a tab-delimited CSV file with all fields following Darwin Core format. A full summary of these fields is given in Online-only Table 3.
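A minimal sketch of how a tab-delimited Darwin Core file of this kind might be read with Python's standard library is given below. The sample rows are invented for illustration and are not GalliForm data; `scientificName` is a standard Darwin Core term assumed to be present, and the value "present" for ‘occurrenceStatus’ is likewise an assumption (only “absent” is described above).

```python
import csv
import io

# Invented sample in the tab-delimited layout described above (not real data).
SAMPLE = (
    "scientificName\tverbatimLocality\toccurrenceStatus\tindividualCount\n"
    "Lophura imperialis\tExample locality A\tpresent\t2\n"
    "Lagopus muta\tExample locality B\tabsent\t\n"
)


def read_records(stream):
    """Parse a tab-delimited text stream into a list of dicts keyed by field name."""
    return list(csv.DictReader(stream, delimiter="\t"))


def presence_only(records):
    """Drop rows whose 'occurrenceStatus' is "absent" (failed surveys)."""
    return [
        r for r in records
        if (r.get("occurrenceStatus") or "").strip().lower() != "absent"
    ]


records = read_records(io.StringIO(SAMPLE))
presences = presence_only(records)
```

To read the deposited file itself, the `io.StringIO(SAMPLE)` argument would be replaced by an open file handle; the file name and encoding are not given here and would need to be checked against the deposit.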
    Georeferencing
    Locality descriptions were converted to geographic co-ordinates using a wide range of atlases and gazetteers, co-ordinates generally only being assigned if accurate to one degree (although in the majority of cases the locations were accurate to within 30 minutes, Table 3). We would initially search for a locality within the gazetteers available to us at the time. If the locality was not listed within those gazetteers we would search for the locality using atlases. Since this work was conducted, MaNIS standards have become widely used for studies of this kind, but these were not fully developed at the time of data collection25. Named places, e.g. towns or counties, were georeferenced using their geographic centre, with georeferencing uncertainty measured from the centre to the edge of the named place. Often localities were given simply as the name of a river, mountain or Protected Area. In these instances we used the midpoint of the river between source and mouth (uncertainty measured as distance from midpoint to source/mouth), the summit of the mountain (uncertainty measured as distance from summit to approximate mountain foot) and the rough centre of the Protected Area (uncertainty measured as distance from centre to Protected Area edge). If a particular locality description matched two or more places, their midpoint was taken (uncertainty measured as distance from midpoint to place). Offsets from localities (e.g. “50 km N of Kuala Lumpur”; “8 miles along the road from Sheffield to Chesterfield”) were measured using a digital atlas (uncertainty was approximated at the georeferencer’s discretion in these instances, usually between 3 and 10 arc-minutes, depending on the vagueness of the offset). For georeferencing done ‘in house’, the gazetteer/atlas used was recorded.
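The midpoint and offset rules above can be illustrated with standard great-circle formulas. This is a sketch under stated assumptions, not the procedure actually used (which relied on gazetteers, atlases and a digital atlas): a spherical Earth with mean radius 6371.0088 km is assumed, the function names are hypothetical, and the co-ordinates used for Kuala Lumpur are approximate.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation (assumption)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def _normalise_lon(lon_deg):
    """Wrap a longitude into the interval [-180, 180)."""
    return (lon_deg + 540.0) % 360.0 - 180.0


def midpoint(lat1, lon1, lat2, lon2):
    """Great-circle midpoint of two places, e.g. when a locality name matches both."""
    p1, l1 = math.radians(lat1), math.radians(lon1)
    p2, l2 = math.radians(lat2), math.radians(lon2)
    bx = math.cos(p2) * math.cos(l2 - l1)
    by = math.cos(p2) * math.sin(l2 - l1)
    pm = math.atan2(math.sin(p1) + math.sin(p2),
                    math.sqrt((math.cos(p1) + bx) ** 2 + by ** 2))
    lm = l1 + math.atan2(by, math.cos(p1) + bx)
    return math.degrees(pm), _normalise_lon(math.degrees(lm))


def offset_point(lat, lon, distance_km, bearing_deg):
    """Point reached by travelling distance_km from (lat, lon) on an initial bearing."""
    d = distance_km / EARTH_RADIUS_KM
    theta = math.radians(bearing_deg)
    p1, l1 = math.radians(lat), math.radians(lon)
    p2 = math.asin(math.sin(p1) * math.cos(d) + math.cos(p1) * math.sin(d) * math.cos(theta))
    l2 = l1 + math.atan2(math.sin(theta) * math.sin(d) * math.cos(p1),
                         math.cos(d) - math.sin(p1) * math.sin(p2))
    return math.degrees(p2), _normalise_lon(math.degrees(l2))


# Two candidate places sharing a name: take the midpoint, with the uncertainty
# measured as the distance from the midpoint to either place.
mid_lat, mid_lon = midpoint(0.0, 0.0, 0.0, 10.0)
uncertainty_km = haversine_km(mid_lat, mid_lon, 0.0, 0.0)

# "50 km N of Kuala Lumpur", using approximate city co-ordinates (assumption).
kl_lat, kl_lon = 3.139, 101.687
off_lat, off_lon = offset_point(kl_lat, kl_lon, 50.0, 0.0)
```

An offset measured along a road, as in the Sheffield to Chesterfield example, cannot be reproduced this way because it follows the road rather than a great circle.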
    Table 3 Georeference and date completeness of the records.

    When possible, localities we could not georeference ourselves were sent to regional experts.
    92% of our localities are georeferenced to an accuracy of 30 minutes, corresponding to 82% of occurrence records (see Table 3).
    We had less success at georeferencing museum records than literature records15, due in part to difficulties in reading handwriting on specimen labels. Older records were also harder to georeference, presumably due to changes in place names over time, and to some early ornithologists failing to document the collection locality. As might be expected, localities from countries that do not use the Roman alphabet were also harder to georeference.
    Some records were excluded from the database based on their locality. These comprised records from what we thought were trading localities, notably Malacca in Malaysia and Leadenhall Market in the UK, and records from captive specimens, e.g. in zoological gardens.
    Dating
    49% of records are dated to within an accuracy of one year. Where possible, we assigned date ranges to undated records. For example, if the name of the collector was given on a museum specimen and we knew when that collector was active in that region, we assigned a date range covering that period. There remain undated records which could perhaps be dated in this way. Undated literature records were designated as occurring before their publication date. We were able to date 89% of records to within 10 years.
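The date-range logic described above might be sketched as follows. The `YearRange` type and function names are hypothetical. Two assumptions are made: that "dated to within N years" means the span of the assigned range is at most N years, and that the publication year can serve as the upper bound of an undated literature record.

```python
from typing import NamedTuple, Optional


class YearRange(NamedTuple):
    """Hypothetical date range for a record; None marks an unknown bound."""
    earliest: Optional[int]
    latest: Optional[int]


def range_from_collector(active_from: int, active_to: int) -> YearRange:
    """Undated specimen: use the period the collector was active in the region."""
    return YearRange(active_from, active_to)


def range_from_publication(publication_year: int) -> YearRange:
    """Undated literature record: it occurred before publication (assumed bound),
    with no lower bound known."""
    return YearRange(None, publication_year)


def dated_within(date_range: YearRange, years: int) -> bool:
    """True if both bounds are known and the span is at most `years` (assumed meaning)."""
    if date_range.earliest is None or date_range.latest is None:
        return False
    return date_range.latest - date_range.earliest <= years
```

Under these assumptions, a specimen from a collector active between 1890 and 1895 would count as dated to within 10 years but not to within one year.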


    Author Correction: Soil carbon loss by experimental warming in a tropical forest

    Affiliations

    School of Geosciences, University of Edinburgh, Edinburgh, UK
    Andrew T. Nottingham & Patrick Meir

    Smithsonian Tropical Research Institute, Panama City, Panama
    Andrew T. Nottingham, Esther Velasquez & Benjamin L. Turner

    Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
    Patrick Meir

    Authors
    Andrew T. Nottingham

    Patrick Meir

    Esther Velasquez

    Benjamin L. Turner

    Corresponding author
    Correspondence to Andrew T. Nottingham.


    Arresting predators


    Life-history strategies of soil microbial communities in an arid ecosystem


    Modeling Posidonia oceanica shoot density and rhizome primary production

    Study area and environmental variables
    The data set used in this study included 192 sites in which lepidochronological data and shoot density were acquired between 1994 and 2003. The rhizome primary production of P. oceanica was estimated as defined by Pergent-Martini et al.12.
    The spatial coverage of the data set was not uniform across the Italian Seas. In fact, the sampling sites were mainly concentrated in five Italian regions, i.e. Liguria, Tuscany, Lazio, Basilicata and Apulia (Fig. 1).
    Figure 1

    Sampling sites from which field data and indirect measurements have been collected (red circles). Data about several sampling stations are available at each site (N = 6 to 15).


    The environmental variables were all acquired from maps and other related information sources (Table 1), according to the main aim of the study. A detailed explanation of these variables and of the methodology for their acquisition is given in the supplementary materials.
    Table 1 Environmental factors used as predictive variables for developing P. oceanica models.

    Since these environmental factors were used as predictive variables in the modeling procedure, their selection was based on the ecological nature of the modelled processes, taking into account their influence on the latter. For instance, it is well known that depth plays a crucial role in determining the properties of P. oceanica meadows, such as density and productivity, as it is strictly related to other fundamental environmental factors, e.g. light. Therefore, both depth and gradient were considered as predictive variables, as well as the profile of the isobaths, described as either linear, convex or concave. The presence of sources of disturbance, such as sewage discharge or similar pollution, was also taken into account, as an increase in turbidity following excessive nutrient enrichment might reduce water transparency and light penetration, which in turn can alter the ecological properties of a P. oceanica meadow. Like the sea floor typologies (sand, rock and matte), sources of disturbance were represented as binary variables because the intention was to use only indirect methods for data acquisition, e.g. maps. With such data sources, only a qualitative assessment could be performed with good confidence. A quantitative coding of those predictive variables would require expensive and time-consuming field activities, which would be a major drawback for the proposed approach.
    The data set was partitioned into two subsets, i.e. training and test sets, for modeling purposes. Data partitioning represents a critical step in modeling, whose aim is obtaining two subsets that are as much as possible independent from each other, while simultaneously representative of the modelled problem, in order to avoid modeling artifacts and to ensure the applicability of the resulting models18.
    Accordingly, the partitioning was not based on random selection of the data, rather the subsets were obtained on the basis of the following approach. The data were stratified according to depth, i.e. they were sorted on the basis of their depth and assigned to one of the following bathymetric classes, i.e.[0,5] m, (5,10] m, (10,15] m, (15,20] m, (20,25] m, (25,35] m. These classes comprised 16.67%, 23.96%, 27.08%, 17.71%, 9.90% and 4.69% of the total number of records, respectively. Subsequently, within each bathymetric class, about 70% of the data, i.e. n = 136, were assigned to the training set, while the remaining ones, i.e. n = 56, to the test set. While the former subset comprising the majority of the data was used for the training procedure of the Machine Learning algorithm, i.e. Random Forest19, the test subset was only used a posteriori to evaluate model performance.
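A depth-stratified partition of this kind could be sketched as below. The text does not say how records were chosen within each bathymetric class, so the seeded shuffle used here is an assumption, as are the function names and the `depth_m` key. Because of per-class rounding, this sketch will not necessarily reproduce the exact 136/56 split reported above.

```python
import random
from collections import defaultdict

# Upper bounds (m) of the classes [0,5], (5,10], (10,15], (15,20], (20,25], (25,35].
DEPTH_CLASS_UPPER_BOUNDS_M = [5, 10, 15, 20, 25, 35]


def depth_class(depth_m):
    """Return the index of the bathymetric class containing depth_m."""
    if depth_m < 0 or depth_m > DEPTH_CLASS_UPPER_BOUNDS_M[-1]:
        raise ValueError(f"depth {depth_m} m is outside the 0-35 m range")
    for index, upper in enumerate(DEPTH_CLASS_UPPER_BOUNDS_M):
        if depth_m <= upper:
            return index


def stratified_split(records, train_fraction=0.7, seed=0):
    """Assign about train_fraction of each depth class to training, the rest to test.
    Within-class selection by seeded shuffle is an assumption of this sketch."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[depth_class(rec["depth_m"])].append(rec)
    train, test = [], []
    for key in sorted(strata):
        members = list(strata[key])
        rng.shuffle(members)
        n_train = round(train_fraction * len(members))
        train.extend(members[:n_train])
        test.extend(members[n_train:])
    return train, test
```

Stratifying before splitting keeps every depth class represented in both subsets, which a purely random split of a small data set does not guarantee.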
    The rationale behind this approach is that depth has a paramount ecological role in regulating both P. oceanica shoot density and rhizome primary production, as previously noted. In fact, a wide range of environmental conditions are related to depth, such as light, water movement and sedimentation flows, which in turn strongly affect the structure, the functioning and the ecological condition of P. oceanica meadows. Therefore, by allocating the data in this way, the inherent variability of the ecological patterns was properly distributed among the subsets, making it possible to obtain ecologically sound models.
    Random Forest
    The Random Forest (RF) is a Machine Learning technique which fits an ensemble of Classification Trees and combines their predictions into a single model19.
    RF has proven effective in a wide range of applications: it can address both regression and classification problems20, and it can be used for cluster analysis and missing-value imputation21,22.
    RF has been used for predicting the current and potential future spatial distribution of plant species23, as well as for estimating marine biodiversity on the basis of sea floor hardness24. RF has also been applied in ecology as a classification tool for assessing the vulnerability of P. oceanica meadows over a large spatial scale25, and for land cover classification using remote sensing data26,27.
    This method relies upon one of the main features of Machine Learning methods, namely that an ensemble of ‘weak learners’ usually outperforms a single ‘strong learner’19. Each Classification Tree in the forest represents a weak learner, i.e. a single model, trained on a partly independent data subset, i.e. a bootstrap sample. Each Classification Tree provides predictions based on the data contained in its bootstrap sample, and many trees are combined into an ensemble model, i.e. a ‘forest’. The overall output of an RF is obtained by averaging the outcomes of all the trees for regression applications, and by majority voting for classification problems.
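    The two ingredients named in this paragraph, bootstrap sampling and the aggregation of tree outputs, can be illustrated with a short Python sketch. The function names are hypothetical and no tree is actually fitted here; the sketch only shows how a bootstrap sample is drawn and how per-tree predictions would be combined.

```python
import random
from collections import Counter
from statistics import mean


def bootstrap_sample(records, rng):
    """Draw len(records) items from records with replacement."""
    return [rng.choice(records) for _ in records]


def aggregate_regression(tree_predictions):
    """Regression output: the average of the per-tree predictions."""
    return mean(tree_predictions)


def aggregate_classification(tree_predictions):
    """Classification output: the class predicted by most trees.

    Ties are resolved in favour of the class encountered first.
    """
    return Counter(tree_predictions).most_common(1)[0][0]
```

    Since sampling is done with replacement, some records appear more than once in a bootstrap sample and others not at all, which is why the trees are trained on only partly independent subsets.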
    The diversity of the trees in the forest is ensured by the use of random subsets of data for the tree-building process, i.e. bootstrap samples, as well as by making a random subset of predictive variables available for the tree splitting procedure. These features allow the RF to reduce the correlation among its Classification Trees, while keeping the variance relatively small, thus leading to a more robust model19.
    Selecting a random subset of predictive variables at each split maintains a certain level of randomness during the tree construction process28, and is necessary for the proper functioning of RF. The size of this random subset is a tuning parameter, called mtry. Together with the minimum number of records to be contained in each leaf, called nodesize, it is one of the main tuning parameters that strongly affect RF performance21,29.
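    As a rough illustration of the role of mtry, the sketch below draws the predictors that would be made available at a single split. The function name, the predictor labels (loosely based on the variables mentioned earlier in the text) and the value mtry = 2 are assumptions chosen for the example, not values reported by the authors.

```python
import random


def candidate_predictors(predictors, mtry, rng):
    """Return mtry predictors drawn at random, without replacement."""
    if not 1 <= mtry <= len(predictors):
        raise ValueError("mtry must be between 1 and the number of predictors")
    return rng.sample(predictors, mtry)


# Hypothetical predictor labels, for illustration only
PREDICTORS = ["depth", "gradient", "isobath_profile",
              "disturbance", "sand", "rock", "matte"]

rng = random.Random(42)
# A fresh subset would be drawn at every split of every tree
split_candidates = candidate_predictors(PREDICTORS, mtry=2, rng=rng)
```

    With mtry equal to the number of predictors, every split would see all variables and this source of randomness would disappear, leaving only the bootstrap samples to differentiate the trees.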
    In his original work, Breiman19 suggested setting mtry equal to p/3 for regression applications, where p is the total number of predictors, and tuning it from half to twice that value. On the other hand, nodesize and ntree (the total number of Classification Trees in the forest) are more related to the generalization ability of the RF and to the overall complexity of the model. Growing a very large forest, e.g. ntree > 500, or growing the trees to achieve a high degree of purity at their leaves, e.g. nodesize

  • in

    Large-scale genome sequencing of mycorrhizal fungi provides insights into the early evolution of symbiotic traits

    Main features of mycorrhizal genomes
    We compared 62 draft genomes from mycorrhizal fungi, including 29 newly released genomes, and predicted 9,344–31,291 protein-coding genes per species (see “Methods”, Supplementary Information and Supplementary Data 1). This set includes new genomes from the early diverging fungal clades in the Russulales, Thelephorales, Phallomycetidae, and Cantharellales (Basidiomycota), and Helotiales and Pezizales (Ascomycota). We combined these mycorrhizal fungal genomes with 73 fungal genomes from wood decayers, soil/litter saprotrophs, and root endophytes (Fig. 1 and Supplementary Data 2). There was little variation in the completeness of the gene repertoires, based on Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis (coefficient of variation, c.v. = 7.98), despite variation in assembly contiguity (Fig. 1). Genome size varied greatly within each phylum, with genomes of mycorrhizal fungi being larger than those of saprotrophic species (Figs. 1 and 2, and Supplementary Data 2; P

  • in

    Drivers of wildfire carbon emissions
