Field experiments underestimate aboveground biomass response to drought
Literature search and study selection
A systematic literature search was conducted in the ISI Web of Science database for observational and experimental studies published from 1975 to 13 January 2020 using the following search terms: TOPIC: (grassland* OR prairie* OR steppe* OR shrubland* OR scrubland* OR bushland*) AND TOPIC: (drought* OR ‘dry period*’ OR ‘dry condition*’ OR ‘dry year*’ OR ‘dry spell*’) AND TOPIC: (product* OR biomass OR cover OR abundance* OR phytomass). The search was refined to include the subject categories Ecology, Environmental Sciences, Plant Sciences, Biodiversity Conservation, Multidisciplinary Sciences and Biology, and the document types Article, Review and Letter. This yielded a total of 2,187 peer-reviewed papers (Supplementary Fig. 1). We first screened these papers by title and abstract, which resulted in 197 potentially relevant full-text articles. We then examined the full text of these articles for eligibility and selected 87 studies (43 experimental, 43 observational and 1 that included both types) on the basis of the following criteria:
(1) The research was conducted in the field, in natural or semi-natural grasslands or shrublands (artificially constructed (seeded or planted) plant communities and studies using monolith transplants, for example, were excluded). We used this restriction because most reports on observational droughts are from intact ecosystems, and experiments in disturbed sites or using artificial communities would thus not be comparable to observational drought studies.
(2) In the case of observational studies, the drought year or multi-year drought was clearly specified by the authors (that is, we did not arbitrarily extract dry years from a long-term dataset). Note that some observational data points are from control plots of experiments (of any kind), where the authors reported that a drought had occurred during the study period. We did not include gradient studies that compare sites of different climates, which are sometimes referred to as ‘observational studies’.
(3) The paper reported the amount or proportion of change in annual or growing-season precipitation (GSP) compared with control conditions. Hereafter, we use the term ‘control’ for the normal-precipitation (non-drought) year or years in observational studies and for ambient precipitation (no treatment) in experimental studies. Similarly, we use the term ‘drought’ for both the drought year or years in observational studies and the drought treatment in experimental studies. In the case of multi-factor experiments, where precipitation reduction was combined with another treatment (for example, warming), we used data only from the plots receiving drought alone and from the control plots.
(4) The paper contained raw data on plant production under both control and drought conditions, expressed in any of the following variables: aboveground net primary production (ANPP), aboveground plant biomass (AGB; in grassland studies only) or percentage plant cover. In 79% of the studies that used ANPP as a production variable, ANPP was estimated by harvesting peak or end-of-season AGB. We therefore did not distinguish between ANPP and AGB, which are both referred to as ‘biomass’ hereafter. We included papers that reported the production of the whole plant community, or at least that of the dominant species or functional groups approximating the abundance of the whole community.
(5) When multiple papers were published on the same experiment or natural drought event at the same study site, we chose the longest-running study, that is, the one including the largest number of drought years.
In addition to the systematic literature search, we included 27 studies (9 experimental, 17 observational and 1 that included both types) meeting the above criteria from the cited references of the Web of Science records selected for our meta-analyses, and from previous meta-analyses and reviews on the topic. In total, this resulted in 114 studies (52 experimental, 60 observational and 2 that included both types; Supplementary Note 9, Supplementary Fig. 2 and ref. 25).
Data compilation
Data were extracted from the text or tables, or were read from the figures using Web Plot Digitizer (ref. 26). For each study, we collected the study site, latitude, longitude, mean annual temperature (MAT) and mean annual precipitation (MAP), study type (experimental or observational), and drought length (the number of consecutive drought years). When MAT or MAP was not documented in the paper, it was extracted from another published study conducted at the same study site (identified by site names and geographic coordinates) or from an online climate database cited in the respective paper. We also recorded vegetation type: grassland when the community was dominated by grasses, or shrubland when the dominant species included one or more shrub species (including communities co-dominated by grasses and shrubs). Data from the same study (that is, paper) but from different geographic locations or environmental conditions (for example, soil types, land uses or multiple levels of experimental drought) were collected as distinct data points (but see ‘Statistical analysis’ for how these points were handled). As a result, the 114 published papers provided 239 data points (112 experimental and 127 observational; ref. 25).
For the observational studies, the normal-precipitation year or years specified by the authors were used as the control. If no such year was specified in the paper, the year immediately preceding the drought year(s) was chosen as the control.
When no data from the pre-drought year were available, the year immediately following the drought year(s) (14 data points) or a multi-year period given in the paper (22 data points) was used as the control. For the experimental studies, we also collected treatment size (that is, rainout shelter area or, if it was not reported in the paper, the experimental plot size).
For the calculation of drought severity, we used yearly precipitation (YP), which was reported in many more studies than GSP. We extracted YP for both control ($\mathrm{YP}_{\mathrm{control}}$) and drought ($\mathrm{YP}_{\mathrm{drought}}$). For the observational studies, when a multi-year period was used as the control or the natural drought lasted for more than one year, precipitation values were averaged across the control or drought years, respectively. Likewise, in the case of multi-year drought experiments, $\mathrm{YP}_{\mathrm{control}}$ and $\mathrm{YP}_{\mathrm{drought}}$ were averaged across the treatment years. When only GSP was published in the paper (63 of 239 data points), we used it to obtain YP data as follows: we regarded MAP as $\mathrm{YP}_{\mathrm{control}}$, and $\mathrm{YP}_{\mathrm{drought}}$ was calculated as $\mathrm{YP}_{\mathrm{drought}} = \mathrm{MAP} - (\mathrm{GSP}_{\mathrm{control}} - \mathrm{GSP}_{\mathrm{drought}})$. From the $\mathrm{YP}_{\mathrm{control}}$ and $\mathrm{YP}_{\mathrm{drought}}$ data, we calculated drought severity as $(\mathrm{YP}_{\mathrm{drought}} - \mathrm{YP}_{\mathrm{control}})/\mathrm{YP}_{\mathrm{control}} \times 100$.
For production, we compiled the mean, replication (N) and, if the study reported it, a variance estimate (s.d., s.e.m. or 95% CI) for both control and drought. In the case of multi-year droughts, data from only the last drought year were extracted, except in five studies (17 data points) where production data were given as an average for the drought years. When both biomass and cover data were presented in the paper, we chose biomass. For each study, we consistently considered replication as the number of the smallest independent study unit.
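The drought severity calculation described above, including the fallback for papers that report only growing-season precipitation, can be illustrated with a minimal Python sketch. The class and function names (`PrecipRecord`, `yearly_precip`, `drought_severity`) and the numbers in the usage lines are hypothetical and chosen only for illustration; they are not taken from the authors' analysis code or dataset.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PrecipRecord:
    """Precipitation values for one data point, in mm (hypothetical structure)."""
    map_mm: Optional[float] = None       # mean annual precipitation (MAP)
    yp_control: Optional[float] = None   # yearly precipitation, control
    yp_drought: Optional[float] = None   # yearly precipitation, drought
    gsp_control: Optional[float] = None  # growing-season precipitation, control
    gsp_drought: Optional[float] = None  # growing-season precipitation, drought


def yearly_precip(rec: PrecipRecord) -> Tuple[float, float]:
    """Return (YP control, YP drought), deriving them from GSP if YP is missing."""
    if rec.yp_control is not None and rec.yp_drought is not None:
        return rec.yp_control, rec.yp_drought
    needed = (rec.map_mm, rec.gsp_control, rec.gsp_drought)
    if any(value is None for value in needed):
        raise ValueError("need either both YP values or MAP plus both GSP values")
    # Fallback from the text: MAP stands in for the control year, and the
    # growing-season deficit is subtracted from MAP to get the drought year.
    yp_control = rec.map_mm
    yp_drought = rec.map_mm - (rec.gsp_control - rec.gsp_drought)
    return yp_control, yp_drought


def drought_severity(yp_control: float, yp_drought: float) -> float:
    """Percentage change in yearly precipitation relative to control."""
    if yp_control <= 0:
        raise ValueError("control precipitation must be positive")
    return (yp_drought - yp_control) / yp_control * 100.0


# Usage with made-up numbers: only GSP is reported for this data point.
record = PrecipRecord(map_mm=400.0, gsp_control=300.0, gsp_drought=100.0)
control, drought = yearly_precip(record)
print(drought_severity(control, drought))  # -50.0
```

Severity is negative whenever drought precipitation falls below the control, so a value of -50 corresponds to a 50% reduction in yearly precipitation. Averaging across multi-year control or drought periods, as described in the text, would take place before these functions are called.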
When only the range of replications was reported in a study, we chose the smallest number.
To quantify climatic aridity for each study site, we used an aridity index (AI), calculated as the ratio of MAP to mean annual potential evapotranspiration (PET): AI = MAP/PET. This index is frequently used in recent climate change research (refs. 27 and 28). AI values were extracted from the Global Aridity Index and Potential Evapotranspiration (ET0) Climate Database v.2 for the period of 1970–2000 (aggregated on an annual basis; ref. 29).
Because we wanted to prevent our analysis from being distorted by a strongly unequal distribution of studies between the two study types with respect to some potentially important explanatory variables, we excluded studies from our focal meta-analysis in three steps. First, we excluded studies that were conducted at wet sites, that is, where site AI exceeded 1. The value of 1 was chosen for two reasons: above this value, the distribution of studies between the two study types was extremely uneven (22 experimental versus 2 observational data points with AI > 1; ref. 25), and the AI value of 1 is a bioclimatically meaningful threshold, where MAP equals PET. Second, we excluded shrublands, because we had only 14 shrubland studies (out of 105 studies with AI