Field survey and sampling
Field data were collected from 130 study sites spanning a latitudinal gradient of 35.89−50.70° N and a longitudinal gradient of 76.62−122.41°E and covering five provinces across the temperate region in northern China (Xinjiang Autonomous Region, Qinghai Province, Gansu Province, Ningxia Autonomous Region, and Inner Mongolia Autonomous Region; Fig. 2a). Locations for the field study target natural drylands, delineated as areas with aridity level above 0.35 (ref. 30), and represent a large aridity gradient including dry-subhumid (N = 12), semiarid (N = 42), arid (N = 56), and hyperarid (N = 20) regions (Fig. 2a), which are highly vulnerable to expected increases in aridity with human activity and climate change33,71. The aridity level of each site was calculated as 1 – AI, where AI is the ratio of precipitation to potential evapotranspiration38. We obtained AI from the Global Aridity Index and Potential Evapotranspiration Climate database (https://cgiarcsi.community/). The selection of the field sites aimed to minimize the potential impacts of human activity and other disturbances on soil, vegetation, and geomorphological characteristics based on the following three criteria: (i) sites were at least 1 km away from major roads and >50 km from human habitations; (ii) sites were under pristine or unmanaged conditions without visible signs of domestic animal grazing, grass/wood collection, engineering restoration plantings, and infrastructure construction; and (iii) the soil was dry without experiencing rainfall events for at least 3 days prior to sampling. Collectively, our field survey involved a wide range of the abiotic and biotic features of dryland ecosystems across northern China. These sites encompass the 14 soil types, i.e., arenosols, calcisols, cambisols, chernozems, fluvisols, gleysols, greyzems, gypsisols, kastanozems, leptosols, luvisols, phaeozems, solonchaks, and solonetz, and the four main vegetation types44, i.e., typical grassland (dominated by Stipa spp., Leymus spp., Cleistogenes spp., and Agropyron spp.), desert grassland (dominated by Stipa spp., Cleistogenes spp., Suaeda spp., and Artemisia spp.), alpine grassland (dominated by Stipa spp., Leymus spp., Carex spp., and Festuca spp.), and desert (dominated by Reaumuria spp., Salsola spp., Calligonum spp., and Nitraria spp.). Elevation, mean annual temperature, and mean annual precipitation (1970–2000; https://www.worldclim.org/) of the sites varied from 204 to 3,570 m a.s.l. (mean, 1,294 m a.s.l.), from –4.3 to 12.8 °C (mean, 5.0 °C), and from 21 to 453 mm (mean, 195 mm), respectively (Supplementary Table 1).
Field sampling was conducted between June and September from 2015 to 2017 (each site was visited once over this period) following well-established standardized protocols as described in refs. 13,34. In brief, three 30 m × 30 m quadrats were established at each site to represent the local vegetation and soil types that covered an area of no less than 10,000 m2. The cover of perennial vegetation was estimated and all perennial plant species were listed by walking steadily along four 1.5 m × 30 m parallel transects (spaced 8 m apart) located within each quadrat using the belt transect method72. Site-level estimate for perennial plant cover was obtained by averaging the values measured in the 12 transects established. After vegetation survey, we located five 1 m × 1 m (for typical grassland, desert grassland, and alpine grassland) or five 5 m × 5 m (for desert) plots within each quadrat (at each corner and the center of the quadrat) to measure site-level plant aboveground and root biomass (g m−2). In each 1 m × 1 m plot, all grasses and dwarf shrubs were harvested to ground level for measurement of aboveground biomass. Five soil cores (7 cm diameter; 0–40 cm depth) per 1-m2 plot were collected randomly, and the roots were removed using a 1-mm sieve and washed cleanly to measure root biomass. All shoot and root samples were dried to constant weight at 65 °C. In each 5 m × 5 m plot, we recorded the number of individuals per dominant shrub species and canopy cover and height of each individual, thereby estimating aboveground and root biomass according to the allometric models developed in previous studies that were conducted in the same regions as sampled here (see Supplementary Table 9 for details). Based on these measurements, we further estimated BNPP. However, BNPP is typically difficult to observe and measure, especially over large spatial scales and environmental gradients as in this study, because the root system is subject to simultaneous growth and turnover73,74. Across our survey areas, ~77–98% of the precipitation occurs between June and September (during the peak-growing season) corresponding to the period of the highest plant above- and belowground biomass34,35,41,75. Therefore, we argue that BNPP can be estimated approximately at each site by the following equation:
$$frac{{{{{{rm{Aboveground}}}}}},{{{{{rm{biomass}}}}}}}{{{{{{rm{Root}}}}}},{{{{{rm{biomass}}}}}}}cong frac{{{{{{rm{Aboveground}}}}}},{{{{{rm{net}}}}}},{{{{{rm{primary}}}}}},{{{{{rm{productivity}}}}}},({{{{{rm{ANPP}}}}}})}{{{{{{rm{BNPP}}}}}}}$$
(1)
where both aboveground and root biomass are site-level measurements (g m−2). We used normalized difference vegetation index (NDVI) as a metric for ANPP as explained in recent studies in drylands14,33,70. NDVI data were obtained from the moderate resolution imaging spectroradiometer aboard NASA’s Terra satellites (https://neo.sci.gsfc.nasa.gov/). We used the average NDVI values during our sampling dates as a proxy for ANPP at the site level as described in ref. 14.
Five soil cores (0–20 cm depth) per quadrat were then taken randomly under the canopies of the dominant perennial plant species and in bare areas devoid of perennial vegetation, respectively, and then were mixed as one sample for vegetation areas and the other sample for bare ground. When more than one dominant perennial plant species was observed, another three composite samples were collected under the canopies of co-dominant perennial plant species. All vegetation and soil surveys were carried out during the wet season (June to September) when biological activity and productivity are maximal; as such, we do not expect the different sampling times and years or seasonality to be a major factor influencing our conclusions. Collectively, 6–21 soil samples per site were collected, and in total 864 samples were taken and analyzed for each of the seven individual soil functions (see below) and multifunctionality. All soil functions evaluated in the field study were calculated at site level by using a weighted average of the mean values observed in vegetated areas and bare ground by their respective cover13,14,38. After field sampling, the visible pieces of plant material were removed carefully from the soil, which was sieved and divided into three portions. The first portion was air-dried and used for soil organic C, total N, total P, available P, and pH analyses. The second portion was immediately mixed with 2 M KCl and stored at 4 °C for soil ammonium and nitrate analyses. The third portion was immediately frozen at –80 °C for assessing soil microbial diversity.
Microcosm experiment
In addition to the large-scale field study described above, we manipulated soil water availability in a microcosm experiment to evaluate the linkages between moisture content, soil microbial diversity, and multifunctionality. It is important to note that our intention is not to directly compare results between these two different approaches [i.e., in the field, measures of soil functions are related to nutrient pools, which we use to associate soil multifunctionality with both plant and soil microbial diversity, whereas in the microcosm experiment the measures of soil functions are related to process rates such as respiration rate and key enzyme activities (see below), which we use to associate soil multifunctionality with microbial diversity in the absence of plants]. Rather, by using an experimental microcosm approach, we aimed to complement the field study and thus further verify the potential increases in aridity to alter the relationship between soil microbial diversity and multifunctionality in the absence of plants. In parallel with the sampling protocols described above, we collected a greater mass of soil (c. 30 kg) under vegetation canopies from one site [i.e., Jingtai country (37.40°N, 104.26°E; Gansu Province, China)]. Soil type, mean annual temperature, mean annual precipitation, and aridity level (1970–2000; https://www.worldclim.org/) of the site is calcisols, 7.9 °C, 205 mm and 0.81, respectively. Following field sampling, the soil was stored immediately at 4 °C until subsequent processing in the laboratory.
In brief, a total of 30 experimental microcosms composed of 10 moisture levels with three replicates were established under sterile conditions in a closed incubation chamber (Supplementary Fig. 1a). Each microcosm was filled with 1 kg of soil. These microcosms were incubated at 18.5 °C [the annual mean land surface temperature (1981–2010) for the sampling site; http://data.cma.cn/en], and moisture contents were adjusted and artificially maintained at the ten levels respectively equivalent to 3, 5, 8, 10, 20, 40, 60, 80, 100, and 120% field capacity (27.6%) during the duration of the experiment for 30 days. The corresponding moisture content (%) measured at the end of the experiment varied from 2.03 ± 0.034 to 33.57 ± 1.94, which matched well with differences in moisture conditions among a subset of field soil samples (N = 521; Supplementary Fig. 1b). After incubation, the soil was removed from each microcosm; a portion of the soil was immediately frozen at –80 °C for molecular analysis, and the other fraction was air-dried, sieved, and stored at –20 °C for assessing multiple soil functions as described below.
DNA extraction, PCR amplification, and amplicon sequencing
For both the field and experimental studies, we assessed the diversity of soil archaea, bacteria, and fungi using Illumina-based sequencing. Genomic DNA was extracted from 0.5 g of each defrosted soil sample (N = 864 for the field study and N = 30 for the experimental study) using the PowerSoil® DNA Isolation Kit (MO BIO Laboratories, USA) according to the manufacturer’s instructions. For our field study, extracted DNA was pooled at site level, ultimately resulting in 130 composite DNA samples under canopies of vegetation and in bare ground, respectively. Pooling DNA samples may outperform the commonly used method that extracts genomic DNA from mixed soil samples, which could remove large amounts of information on the diversity of soil microorganisms14,22. Negative controls (deionized H2O in place of soil) underwent identical procedures during the extraction to ensure zero contamination in downstream analyses.
The V3−V5 regions of the archaeal 16S rRNA gene were amplified using the primer pair Arch344F and Arch915R. Thermal conditions were composed of an initial denaturation of 3 min at 95 °C, ten cycles of touchdown PCR (95 °C for 30 s, annealing temperatures starting at 60 °C for 30 s then decreasing 0.5 °C per cycles, and 72 °C for 1 min), followed by 25 cycles at 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 1 min, with a final extension at 72 °C for 10 min. The primer pair 338F and 806R was used for amplification of the V3−V4 regions of the bacterial 16S rRNA gene. Thermocycling conditions consisted of 3 min at 95 °C and then subjected to 30 amplification cycles of 30 s denaturation at 95 °C, 30 s annealing at 55 °C, followed by 72 °C for 45 s, and a final extension of 72 °C for 10 min. The fungal internal transcribed spacer (ITS) region 1 was amplified using the primer pair ITS1F and ITS2. The amplification conditions involved denaturation at 95 °C for 3 min, 35 cycles of 94 °C for 1 min, 51 °C for 1 min, and 72 °C for 1 min and a final extension at 72 °C for 10 min. Details of primers for each microbial taxa were given in Supplementary Table 10. These primers contained variable length error-correcting barcodes unique to each sample. All amplification reactions were performed in a total volume of 20 μl containing 4 μl of 5× FastPfu Buffer, 2 μl of 2.5 mM dNTPs, 0.8 μl of both the forward and reverse primers, 10 ng of template DNA, and 0.4 μl of FastPfu DNA Polymerase (TransGen Biotech., China). To mitigate individual PCR reaction biases each sample was amplified in triplicate and pooled together. All PCRs were done with the ABI GeneAmp® 9700 Thermal Cycler (Thermo Fisher Scientific, USA). PCR products were evaluated on 2.0% agarose gel with ethidium bromide staining to ensure correct amplicon length, and were gel-purified using the AxyPrep DNA Gel Extraction Kit (Axygen Biosciences, USA). Purified amplicons were combined at equimolar concentrations and paired-end sequenced (2 × 300 bp) on an Illumina MiSeq platform (Illumina, USA) at the Majorbio Bio-pharm Technology Co., Ltd. (Shanghai, China) according to standard protocols.
Sequence processing
Initial sequence processing was conducted with the QIIME pipeline76. Briefly, reads were quality-trimmed with a threshold of an average quality score higher than 20 over 10 bp moving-window sizes and a minimum length of 50 bp. Paired-end reads with at least 10 bp overlap and <5% mismatches were merged into full length sequences by using the FLASH program77. Sequences were de-multiplexed into samples based on their barcodes and a further round of quality control was performed to remove sequences containing any ambiguous bases or with a phred score <20 over the entire read length. 16S rDNA sequences were checked for chimeras by using the UCHIME algorithm78 against “Gold” database (http://www.drive5.com/uchime/gold.fa). Chimeric ITS sequences were detected de novo by exploiting abundance data using UCHIME78. Identified chimeric sequences were eliminated from the dataset before downstream analyses, ultimately resulting in a total of 7,040,043/2,180,656 (field and experimental datasets, respectively), 7,141,265/2,349,035 and 9,045,735/3,831,302 high-quality chimaera-free sequences for archaea, bacteria, and fungi, respectively. Then, the sequences were clustered into OTUs according to 97% pairwise identity with the UCLUST algorithm in the USEARCH package79. Singleton OTUs were discarded. We determined the taxonomic identity of representative sequences from each OTU using the RDP Classifier80 against the SILVA database v.128 release for prokaryotic OTUs or the UNITE database v.7.0 release for fungal OTUs. Sequences not classified at kingdom level or identified as non-microorganisms were removed. To correct sampling effort, the resultant OTU abundance tables were subsequently rarefied to the lowest number of sequences (10,707/27,265, 14,317/23,916, and 25,263/40,329 for archaea, bacteria, and fungi, respectively) found within an individual sample. Our resampled dataset included a total of 1,763/2,305, 14,060/5,604, and 9,463/1,786 OTUs for archaea, bacteria, and fungi, respectively.
Measurement of individual soil functions
For the field study, we measured in all soil samples the following seven variables related to nutrient pools: DNA concentration, soil organic C, total N, ammonium, nitrate, total P, and available P. These variables are key soil properties involving stocks of matter and energy that are representative of slow abiotic and biotic processes and can act as appropriate indicators of net process rates at long time scales such as soil C sequestration, soil nutrient storage capacity, and the build-up of soil fertility2,3,4. Variables such as these are the most commonly measured indicators for multifunctionality studies conducted in dryland ecosystems4,13,14,49,81,82 and are considered to be critical determinants of soil functioning in natural drylands13,33,38,83. Collectively, these variables reflect multiple ecosystem functional categories including nutrient cycling, climate regulation, and soil properties and fertility which fall within the “supporting” and “regulating” ecosystem service categories4,43. In brief, organic C, total N, total P, and available P are good proxies of C, N, and P storage and also act as surrogates for C, N, and P availability for plants and microorganisms in drylands13,38. In particular, organic C is often used as an indicator of soil C sequestration4, and N and P often limit the growth of plants and microorganisms, and ultimately food, fiber, and biomass production in drylands38. Ammonium and nitrate are the fraction of the soil N pool that is more readily available for plant and microbial uptake and are produced by microorganism-mediated processes such as nitrification, mineralization, and atmospheric-N fixation84. DNA concentration has been used as a powerful indicator of surface soil microbial biomass14,45,67, which acts as a good substitute of microbial activity4. Across arid and semiarid regions in northern China, DNA concentration has recently been reported to be strongly related to soil microbial-biomass C and N (both Pearson’s r > 0.97) estimated by the chloroform-fumigation extraction method67. Moreover, as a molecule rich in N and P, DNA could serve as a source of P, as well as C and energy, for soil microorganisms under nutrient-limiting conditions85.
The determination of soil organic C was based on the Walkley–Black chromic acid wet oxidation method86. Total soil N was measured by the Kjeldahl sulfuric acid-digestion method87. Total soil P was determined by a molybdate colorimetric test after perchloric acid digestion88. Soil available P was assessed following a 0.5 M NaHCO3 extraction89. Ammonium concentration was measured colorimetrically by the indophenol blue method90. Nitrate was first reduced to nitrite with hydrazine sulfate, and its concentration was determined from 2 M KCl extracts91. The concentration of DNA extracted as described above was measured with a NanoDropTM 2000 UV-Vis spectrophotometer (Thermo Fisher Scientific, USA).
For the microcosm study, we measured in all soil samples seven variables: DNA concentration, activity of alkaline phosphatase (P mineralization), invertase (sucrose degradation), urease (N mineralization), β-glucosidase (starch degradation), catalase (hydrogen peroxide decomposition), and CO2 fluxes (respiration). Extracellular enzymes such as those we measured are produced by soil microorganisms and the proximate agents of processes such as the stabilization and destabilization of soil organic matter, and measures of their activities are also considered a good indicator of soil fertility and microbial nutrient demand. In addition, respiration can be used as a proxy of soil microbial activity. Altogether, these variables are involved in microbial activity, the degradation and mineralization of organic matter, and nutrient cycling in soil13,46,81,92. For our microcosm study, we did not measure those soil variables selected for our field study, except for DNA concentration, because those variables indicative of net process rates over long time periods were expected to be less sensitive to changes in soil moisture content and thus to be less affected during the short-term experiment (i.e., 30 days)81. Soil alkaline phosphatase, invertase, and urease activities were measured from air-dried soil as described in ref. 93. The activity of β-glucosidase was assayed following the procedure described in ref. 13. Soil catalase activity was determined by back-titrating residual H2O2 with KMnO4 according to ref. 94. Measurement of DNA concentration was done as described above. Soil CO2 fluxes were measured at the end of experiment using a LI-COR 6400 portable CO2 infrared gas analyzer with a 6400-09 soil respiration chamber (LI-COR Inc., Lincoln, NE, USA).
Trade-offs and redundancy among soil functions
For the field study, we evaluated potential trade-offs among individual soil functions that perform at high levels while others perform at low levels3. To do so, we calculated Pearson’s correlation coefficients between each pair of individual soil functions. Among a total of 21 combinations we detected significant positive correlations in 18 of them and none showed a significant negative correlation (Supplementary Fig. 18a), suggesting no trade-offs between them. Additionally, the soil functions that were strongly correlated imply some degree of redundancy3,19,82,95. However, in only two cases (i.e., total N vs. DNA concentration and total N vs. organic C), correlations had r values higher than 0.7 (refs. 19,82), indicating that the degree of redundancy was not very high (Supplementary Fig. 18a).
Assessment of soil multifunctionality
We assessed soil multifunctionality using three basic methods: single-function, averaging, and multiple-threshold approaches, all of which have frequently been employed to measure multifunctionality in recent literature13,14,22,26,45 and give complementary information for quantifying multifunctionality3,10. The averaging approach aims to combine a collection of soil functions into a single index that quantifies the average level of multiple soil functions. To obtain a quantitative multifunctionality index for each field site or experimental microcosm, we first normalized (log10-transformed when needed) and standardized each of the evaluated soil functions using the Z-score transformation. The Z-scores of the soil functions were then averaged to obtain a multifunctionality index for each field site or experimental microcosm13. For the field study, all selected individual soil functions had significant positive correlations with the multifunctionality index (Supplementary Fig. 18a). Although the averaged multifunctionality index has good statistical properties and provides an intuitively interpretable measure of the ability of an ecosystem to deliver multiple functions simultaneously, it ignores potential trade-offs among functions and assumes the substitutability of functions3,10,13. To address these limitations, we also quantified multifunctionality using the multiple-threshold approach for the field study. Multiple-threshold approach captures the number of functions that simultaneously exceed different thresholds of the maximum observed value of each function and evaluates whether more (or fewer) functions are performing simultaneously at high (or low) levels10. Following the recommendation as described in ref. 10, we determined the maximum level of functioning as the average of the top four values measured for the respective soil function among all field sites. Multiple-threshold multifunctionality was then calculated as the number of functions surpassing a series of consecutive thresholds (from 1 to 99% at 1% intervals) of the maximum of each function. This approach provides the following key indices to evaluate the relationships between biodiversity and multifunctionality: Tmin (the lowest threshold where biodiversity–multifunctionality relationships become significant), Tmax (the highest threshold beyond which biodiversity–multifunctionality relationships become nonsignificant) and Tmde (the threshold where biodiversity shows a strongest positive or negative association with multifunctionality). Accordingly, Mmin, Mmax, and Mmde indicate the number of functions (i.e., multiple-threshold multifunctionality) achieving at the respective thresholds10. Thus, it can be concluded that biodiversity exhibits a strong and positive association with multifunctionality if Tmin is low and the rest of the five indices are high; conversely, biodiversity exhibits a strong and negative association with multifunctionality if Tmax is high and the rest of the five indices are low10. Finally, we also realize that both multifunctionality metrics described above aggregate individual functions to characterize overall soil functioning, and thus may obscure relationships between biodiversity and key functions3. To address this problem, we incorporated the single-function approach for the field study, which facilitates mechanistic understanding of multifunctionality and the interpretations of results generated by using those aggregated metrics10. However, the relationships between soil multifunctionality and aridity or biodiversity evaluated by using both the single-function and multiple-threshold approaches were very similar to those obtained with the averaging approach, and hence we always used the averaged multifunctionality index as a metric of soil multifunctionality in the main text to make our results easier to compare.
Metrics of plant and soil microbial diversity
For the field study, we used the total number of plant species (plant species richness), archaeal OTUs (soil archaeal richness), bacterial OTUs (soil bacterial richness), and fungal OTUs (soil fungal richness) tallied at each field site as a surrogate of plant, archaeal, bacterial, and fungal diversity, respectively. Given that fungi typically include several guilds (e.g., saprotrophs, pathogens, and symbionts), we further parsed fungal OTUs into these trophic modes based on their taxonomic assignments using the FUNGuild tool96. Following the recommendation as described in ref. 96, we only kept those fungal OTUs with mode assignments that are “highly probable” and “probable”, so as not to overinterpret our results ecologically. We also calculated the diversity of each fungal trophic mode as the total number of OTUs within each mode tallied at each field site. Furthermore, for both the field and microcosm studies, we calculated a single index (i.e., soil microbial diversity index) to represent the overall changes in soil microbial diversity20,22,45. In brief, we first standardized all components of microbial diversity metrics (i.e., soil archaeal, bacterial, and fungal richness) by the Z-score transformation (overall mean of 0 and standard deviation of 1). The Z-scores of the diversity metrics were then averaged to obtain the soil microbial diversity index for each field site or experimental microcosm.
Statistical analyses
Field study
Before statistical analyses, all the data were tested for normality. All soil functions, elevation and plant species richness were log10-transformed to improve the normality and homoscedasticity of residuals. We first evaluated the responses of each of the seven individual soil functions and multifunctionality to increasing aridity using linear and nonlinear [quadratic and generalized additive models (GAMs)] regressions (Fig. 2b–i), and the model with lower AIC value was selected in each case [differences in AIC (ΔAIC) values >2 indicate that the models are different; Supplementary Table 2]. We further assessed whether soil multifunctionality responded more rapidly to aridity than did any individual soil functions. To this end, we explored the presence of aridity thresholds for those relationships that were better fitted by nonlinear regressions (Fig. 2b–i) using the standard protocols developed in ref. 33. The presence of an aridity threshold means that once an aridity level is reached, a given variable either changes abruptly its value (i.e., discontinuous threshold) or its relationship with aridity (i.e., continuous threshold). Hence, a lower aridity threshold indicates that a given variable is more vulnerable to increasing aridity than are others33. We further fitted step (a linear regression that modifies only intercept at a given aridity level) and stegmented (showing changes both in intercept and slope at a given aridity level) regressions for the determination of discontinuous thresholds, and segmented (exhibiting changes only in slope at a given aridity level) regressions for continuous thresholds. Each of these models yields a change point (i.e., threshold) describing the aridity level that evidences the shift in a given nonlinear relationship evaluated. We also used AIC to choose the best threshold model and the corresponding threshold in each case (Supplementary Table 2).
We then employed analysis of variance based on type-I sum of squares in a linear mixed-effects model (Eq. (2); Table 1) to test the relationships between the multiple biotic (BNPP, plant species richness, and the soil microbial diversity index) and abiotic (aridity, soil pH, and soil clay content) factors and soil multifunctionality:
$${{{{{rm{Soil}}}}}},{{{{{rm{multifunctionality}}}}}} sim; {{{{{rm{Year}}}}}}+{{{{{rm{Plant}}}}}},{{{{{rm{species}}}}}},{{{{{rm{richness}}}}}} quad+,{{{{{rm{Soil}}}}}},{{{{{rm{microbial}}}}}},{{{{{rm{diversity}}}}}},{{{{{rm{index}}}}}} quad+{{{{{rm{Plant}}}}}},{{{{{rm{species}}}}}},{{{{{rm{richness}}}}}}times {{{{{rm{Soil}}}}}},{{{{{rm{microbial}}}}}},{{{{{rm{diversity}}}}}},{{{{{rm{index}}}}}}+{{{{{rm{Aridity}}}}}} quad+,{{{{{rm{Aridity}}}}}}times {{{{{rm{Plant}}}}}},{{{{{rm{species}}}}}},{{{{{rm{richness}}}}}} quad+,{{{{{rm{Aridity}}}}}}times {{{{{rm{Soil}}}}}},{{{{{rm{microbial}}}}}},{{{{{rm{diversity}}}}}},{{{{{rm{index}}}}}}+{{{{{rm{BNPP}}}}}}+{{{{{rm{Soil}}}}}},{{{{{rm{pH}}}}}}+{{{{{rm{Soil}}}}}},{{{{{rm{clay}}}}}},{{{{{rm{content}}}}}} quad+,{{{{{rm{Elevation}}}}}}+{{{{{rm{Latitude}}}}}}+{{{{{rm{Longitude}}}}}}+(1|{{{{{rm{Soil}}}}}},{{{{{rm{type}}}}}})+(1|{{{{{rm{Vegetation}}}}}},{{{{{rm{type}}}}}})$$
(2)
where × indicates an interaction term. We obtained information on soil clay content (%) from the SoilGrids system (https://soilgrids.org/), and eliminated variation due to different sampling years by first entering the term “Year” into the statistical model41. The elevation, latitude, and longitude of the study sites were included to account for the spatial structure of our dataset13,70. To account for the similarities of soil and vegetation types among study sites we included “Soil type” and “Vegetation type” as random terms.
We further simplified the Eq. (2) to focus only on the relationships between aridity, biodiversity, and soil multifunctionality (Eq. (3); Supplementary Fig. 5). We did so because excluding additional biotic and abiotic factors did not change qualitatively the main results presented here (Table 1 and Supplementary Fig. 5), and therefore we used the simplest model to test our hypotheses more clearly. Our simplified model was:
$${{{{{rm{Soil}}}}}},{{{{{rm{multifunctionality}}}}}} sim {{{{{rm{Year}}}}}}+{{{{{rm{Plant}}}}}},{{{{{rm{species}}}}}},{{{{{rm{richness}}}}}} quad+,{{{{{rm{Soil}}}}}},{{{{{rm{microbial}}}}}},{{{{{rm{diversity}}}}}},{{{{{rm{index}}}}}} quad+,{{{{{rm{Aridity}}}}}}+{{{{{rm{Aridity}}}}}}times {{{{{rm{Plant}}}}}},{{{{{rm{species}}}}}},{{{{{rm{richness}}}}}} quad+,{{{{{rm{Aridity}}}}}}times {{{{{rm{Soil}}}}}},{{{{{rm{microbial}}}}}},{{{{{rm{diversity}}}}}},{{{{{rm{index}}}}}} quad+,{{{{{rm{Aridity}}}}}}times {{{{{rm{Plant}}}}}},{{{{{rm{species}}}}}},{{{{{rm{richness}}}}}}times {{{{{rm{Soil}}}}}},{{{{{rm{microbial}}}}}},{{{{{rm{diversity}}}}}},{{{{{rm{index}}}}}} quad+,(1|{{{{{rm{Soil}}}}}},{{{{{rm{type}}}}}})+(1|{{{{{rm{Vegetation}}}}}},{{{{{rm{type}}}}}})$$
(3)
To evaluate how the biodiversity–multifunctionality relationships varied along aridity gradients, we conducted a moving-window analysis as detailed in ref. 69. Briefly, we performed the linear mixed-effects model described in Eq. (3) for a subset window of 60 study sites with the lowest aridity values (this number of sites provided sufficient statistical power for our model), and repeated the same calculations as many times as sites remained (i.e., 70). We then bootstrapped the standardized coefficients of each fixed term within each subset window, which was matched to the average value of aridity across the 60 sites. We fitted linear and nonlinear regressions to the bootstrapped coefficients of biodiversity and its interaction with aridity along aridity gradients (Fig. 3a, b and Supplementary Table 2), and identified the aridity thresholds for the changes in the coefficients of biodiversity (Fig. 3a and Supplementary Table 2) using the same procedure already described above. To provide further support for the aridity thresholds identified here, we also assessed the significance of the bootstrapped standardized coefficients of biodiversity and its interaction with aridity at 95% confidence intervals for each subset window (Fig. 3e). Before fitting threshold regressions, we evaluated whether the variables followed either a unimodal or bimodal distribution using the fitgmdist function in MATLAB (The MathWorks Inc., USA). Our results showed that all variables used for threshold detection presented unimodal distributions (Supplementary Table 11), suggesting that the three threshold regressions mentioned above (i.e., segmented, step, and stegmented) are appropriate in all cases33. We used the chngpt and gam packages in R (http://cran.r-project.org/) to fit segmented/step/stegmented and GAM regressions, respectively. To further check the validity of the thresholds identified, we bootstrapped linear regressions at both sides of each threshold for each variable. We then used the nonparametric Mann–Whitney U-test to compare the slope and the predicted value evaluated before and after each threshold. In all cases, we found significant differences in both of these two parameters (Fig. 3c, d and Supplementary Figs. 2, 3, 6).
Given a clear shift in the relationships between plant or microbial diversity and soil multifunctionality occurring at a threshold around an aridity level of 0.8 (Fig. 3), we further used OLS regressions to clarify the relationships between each component of plant or microbial diversity and soil multifunctionality in less and more arid regions separately, as well as across all sites (Fig. 4). To do so, we split our study sites into two groups: sites with aridity <0.8 (less arid regions; N = 54) and >0.8 (more arid regions; N = 76). Moreover, we fitted the mixed-effects model described in Eq. (2) for less and more arid regions separately to ensure the robustness of these bivariate correlations when accounting for multiple biotic and abiotic factors simultaneously, with the exception of using all components of microbial diversity metrics (i.e., soil archaeal, bacterial, and fungal richness) instead of the soil microbial diversity index in the models (Supplementary Table 4). All linear mixed-effects models were performed using the R package lme4. We used a variance inflation factor (VIF) to evaluate the risk of multicollinearity, and selected variables with VIF <10 in all cases97. Also, we evaluated whether a fitted mixed-effects model is singular (i.e., variance of any random term is close to zero) using the isSingular function. Moreover, we extracted the marginal (variance explained by fixed factors) and conditional (variance explained by fixed and random factors) R2 values using the R package piecewiseSEM.
We next used SEMs to compare the hypothesized direct and indirect relationships between aridity, soil pH, soil clay content, BNPP, plant and microbial diversity, and soil multifunctionality in less and more arid regions (Fig. 5). The first step in SEM requires establishing an a priori model based on the hypothesized causal relationships among these variables (Supplementary Fig. 17a). Before modeling, bivariate correlations were checked between all variables to ensure that a linear model was appropriate (Supplementary Fig. 18b, c). We then parameterized our models using the grouped dataset and tested their respective goodness-of-fit statistics. Here we used the chi-squared test (χ2) and the root mean square error of approximation (RMSEA). Furthermore, as some of the variables didn’t satisfy normality, the fit of the model was confirmed using the Bollen–Stine bootstrap test. All of these goodness-of-fit metrics revealed an acceptable fit of our a priori model, with the exception of removing the relationship between soil pH and BNPP with a coefficient close to zero for the SEM of more arid regions to improve its model fit (Fig. 5 and Supplementary Table 5).
We must note, however, that total soil P is typically considered to be controlled mostly by abiotic processes such as weathering of rocks rather than biotic processes in dryland ecosystems38. Also, total soil N was strongly correlated with DNA concentration and organic C in our dataset (Supplementary Fig. 18a), which may provide redundant information for the multifunctionality metrics used3,95. To ensure that these were not influencing our results, we repeated above analyses after excluding total soil N and P. These two sets of analyses provided very similar results (Supplementary Figs. 19–28 and Supplementary Tables 2, 3, 6–8), thus these issues do not affect the conclusions of this study.
Finally, we adopted a space-for-time substitution approach to quantitatively estimate the future changes in areas that are likely to cross the 0.8 aridity threshold identified in drylands across northern China (Fig. 6). To this end, we located both the expanding and shrinking areas with aridity levels crossing 0.8 projected for 2100 relative to 1970–2000 under two different scenarios (i.e., RCP 4.5 and 8.5, assuming saturated and exponential increases of CO2 emissions, respectively) based on the aridity maps provided by Huang et al.71. Because the AI dataset (https://cgiarcsi.community/) has a spatial resolution of 30 arc-sec, we downscaled the data to a 0.5° × 0.5° resolution to match the aridity maps. And we excluded those areas that even will cross 0.8 aridity threshold by 2100 but that are not drylands today from our analyses to avoid overestimating our results33. All maps were visualized in ArcGIS 10.2 (ESRI, USA).
Microcosm study
Before analyses, soil respiration was log10-transformed to improve the normality and homoscedasticity of residuals. We first evaluated the relationship between moisture content and soil multifunctionality using OLS regression (Supplementary Fig. 31a). We then used OLS regressions to assess the relationships of each component of microbial diversity (i.e., soil archaeal, bacterial, and fungal OTU richness) and the soil microbial diversity index with soil multifunctionality across the ten different moisture levels, as well as at both low and high moisture levels. Corresponding to the 0.8 aridity threshold identified in the field study, we selected 20% field capacity to split our dataset into two groups: 3−20% field capacity (low moisture levels) and 40−120% field capacity (high moisture levels). We did so because the value of moisture content (i.e., 6.09 ± 0.39%) measured at 20% field capacity is within the maximum observed values of moisture content of field soil samples collected in arid regions (Supplementary Fig. 1). Thus, the selected moisture level could be closer to that of the lower boundary of arid regions (i.e., 0.8 aridity level) under field conditions. We only presented the result for soil bacterial richness (Supplementary Fig. 31b) because all of the other relationships were nonsignificant (P > 0.05). Finally, we also used SEMs to compare the hypothesized direct and indirect relationships between moisture content, microbial diversity, and soil multifunctionality at low and high moisture levels (see an a priori model in Supplementary Figs. 17b and 31c, d). Test of goodness-of-fit for SEMs were same as described above. All the SEM analyses were conducted using AMOS 21.0 (IBM SPSS Inc., USA). Data and code used to perform above analyses are available in figshare98.
Reporting Summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Source: Ecology - nature.com