Empirical data
We applied our theory to two datasets (Table 1): the plant survey dataset and the biodiversity-manipulated experiment dataset. The plant survey dataset contains nine sites of long-term grassland experiments across the United States (see also Hallett et al.10, and Zhao et al.23). Five of nine sites are from the Long Term Ecological Research (LTER) network (see Table 1). Plant abundances were measured either as biomass or as percent cover. In percent-cover cases, summed values can exceed 100% due to vertically overlapping canopies. All sites were sampled annually and were spatially replicated. We only used data of the plant survey dataset from unmanipulated control plots. Methods for data collection were constant over time.
The biodiversity-manipulated experimental dataset comprises two long-term grassland experiments, BigBio and BioCON, at the Cedar Creek Ecosystem Science Reserve. Both experiments directly manipulated plant species number (1, 2, 4, 8, 16 for BigBio; and 1, 4, 9, 16 for BioCON). BioCON also contains different treatment levels for nitrogen and atmospheric CO2, but here only data from the ambient CO2 and ambient N treatments were used. We excluded plots with only one species. BigBio comprises 125 plots over 17 years, and BioCON comprises 59 plots over 22 years (Table 1).
Theory
Let xi(t) denote the biomass of species i = 1, …, S at time t = 1, …, t and let μi = mean (xi (t)), σi = ({{mbox{sd}}})(xi (t)), and ({v}_{i}={sigma }_{i}^{2}) be the mean, standard deviation and variance of species i, computed through time. Let vij = cov (({x}_{i}left(tright),, {x}_{j}left(tright))) be the covariance, through time, of the dynamics of species i and j. Let xtot (left(tright)={sum }_{i}{x}_{i}(t)), ({mu }_{{{mbox{tot}}}}={sum }_{i}{mu }_{i}), ({v}_{{{mbox{tot}}}}={sum }_{i,j}{v}_{{ij}}), and ({{{{{{rm{sigma }}}}}}}_{{{{{{rm{tot}}}}}}}=sqrt{{v}_{{{{{{rm{tot}}}}}}}}). When population time series are uncorrelated, ({v}_{{{{{{rm{tot}}}}}}}={sum }_{i}{v}_{i}).
As defined previously10,15, community stability is the inverse coefficient of variation of ({x}_{{{mbox{tot}}}}left(tright)), ({S}_{{{{{{rm{com}}}}}}}={mu }_{{{{{{rm{tot}}}}}}}/{sigma }_{{{{{{rm{tot}}}}}}}). Population stability is the inverse of weighted-average population variability9, ({sum }_{i}frac{{mu }_{i}}{{mu }_{{{{{{rm{tot}}}}}}}}{{CV}}_{i}={sum }_{i}frac{{mu }_{i}}{{mu }_{{{{{{rm{tot}}}}}}}}frac{{sigma }_{i}}{{mu }_{i}}={sum }_{i}frac{{sigma }_{i}}{{mu }_{{{{{{rm{tot}}}}}}}}), i.e, ({S}_{{pop}}={mu }_{{{{{{rm{tot}}}}}}}/{sum }_{i}{sigma }_{i}). The ratio of community stability over population stability is the Loreau-de Mazancourt asynchrony index14, Φ = ({sum }_{i}{sigma }_{i}/{sigma }_{{{{{{rm{tot}}}}}}}), so that
$${S}_{{{{{{rm{com}}}}}}}=varPhi {S}_{{{{{{rm{pop}}}}}}}.$$
(1)
Now we suppose a hypothetical community with the same species-level variances and means as the original community but with species covariances equal to zero. Then, (1) becomes Scom_ip = (SAE)Spop, where ({S}_{{{{{{rm{com}}}}}}_{{{{{rm{ip}}}}}}}=frac{{mu }_{{{{{{rm{tot}}}}}}}}{sqrt{{sum }_{i}{v}_{i}}}=frac{{mu }_{{{{{{rm{tot}}}}}}}}{sqrt{{sum }_{i}{sigma }_{i}^{2}}}) is the value of community stability in the case of uncorrelated or independent populations and SAE is the component of Φ due to statistical averaging (here, “ip” stands for “independent populations”). The equation Scom_ip = (SAE)Spop can be interpreted as a definition of SAE. We then have
$$SAE=frac{{S}_{{{{{{rm{com}}}}}}_{{{{{rm{ip}}}}}}}}{{S}_{{{{{{rm{pop}}}}}}}}=frac{{sum }_{i}{sigma }_{i}}{sqrt{{sum }_{i}{sigma }_{i}^{2}}}.$$
(2)
The compensatory effect is then the rest of Φ, i.e.,
$$CPE=frac{{S}_{{{{{{rm{com}}}}}}}}{{S}_{{{{{{rm{pop}}}}}}}times SAE}=frac{{sum }_{i}{sigma }_{i}}{{sigma }_{{{{{{rm{tot}}}}}}}left({sum }_{i}{sigma }_{i}/sqrt{{sum }_{i}{sigma }_{i}^{2}}right)}=frac{sqrt{{sum }_{i}{sigma }_{i}^{2}}}{{sigma }_{{{{{{rm{tot}}}}}}}}.$$
(3)
Considering the classic variance ratio ({{{{{rm{varphi }}}}}}=frac{{V}_{{{{{{rm{tot}}}}}}}}{{sum }_{i}{V}_{i}}=frac{{sigma }_{{{{{{rm{tot}}}}}}}^{2}}{{sum }_{i}{sigma }_{i}^{2}}), our CPE is (1/sqrt{varphi }). Values CPE > 1 (respectively, <1) correspond to greater (resp., lesser) community stability than would be expected if dynamics of different taxa were independent, reflecting compensatory (resp., synchronous) dynamics. We also have
$$SAE=frac{{sum }_{i}{sigma }_{i}}{sqrt{{sum }_{i}{sigma }_{i}^{2}}}=sqrt{frac{1}{{sum }_{i}{sigma }_{i}^{2}/{({sum }_{i}{sigma }_{i})}^{2}}}=sqrt{frac{1}{{sum }_{i}{({sigma }_{i}/{sum }_{i}{sigma }_{i})}^{2}}}=sqrt{frac{1}{{sum }_{i}{({p}_{i})}^{2}}}.$$
(4)
where ({p}_{i}={sigma }_{i}/{sum }_{i}{sigma }_{i}).
In the Introduction and Results, we characterized SAE as relating to statistical mechanisms, but Eq. (4) shows that the strength of SAE depends on the evenness of species variances, which may be influenced by species biology. In fact, ({SAE}=frac{{S}_{{{{{{rm{com}}}}}}_{{{{{rm{ip}}}}}}}}{{S}_{{{{{{rm{pop}}}}}}}}) is a comparison between the two values that community stability would take in the scenarios of independent populations (({S}_{{{{{{rm{com}}}}}}_{{{{{rm{ip}}}}}}})) and entirely synchronous populations (({S}_{{{{{{rm{pop}}}}}}};) see elsewhere14 for a proof that ({S}_{{{{{{rm{pop}}}}}}}) is the value community stability would take if populations were perfectly synchronous), and so may be best characterized as a statistical effect, the strength of which depends on evenness of variances (and therefore on biology). We define ({{SAE}}_{{{{{{rm{even}}}}}}}) to be the value SAE would take in the case of perfect evenness of species variances, i.e., if all species variances were identical (that means σ1 = σ2 = … = σ). We can then define the evenness effect, ({EVN}=frac{{SAE}}{{{SAE}}_{{{{{{rm{even}}}}}}}}), so that the equation ({SAE}={EVN}times {{SAE}}_{{{{{{rm{even}}}}}}}) separates out biological/evenness effects from purely statistical effects. ({{SAE}}_{{{{{{rm{even}}}}}}}) is a purely statistical version of statistical averaging because it only captures the stabilizing effects of averaging independent random variables, being uninfluenced by evenness. It can straightforwardly be shown that ({{SAE}}_{{{{{{rm{even}}}}}}}=sqrt{S}): applying Eq. (4) to the scenario of equal population variances, one sees that ({{SAE}}_{{{{{{rm{even}}}}}}}=sqrt{frac{1}{{{sum }_{i}left(1/Sright)}^{2}}}=sqrt{S}). This is the maximum possible value of SAE for a given species richness, S.
It is possible to explicitly quantify the effects of the evenness of variability. Using the formula ({{log }}left({S}_{{{{{{rm{com}}}}}}}right)={{log }}left({SAE}right)+{{log }}left({CPE}right)+{{log }}({S}_{{{{{{rm{pop}}}}}}}),) the main observation of our study has been that regression slopes of ({{log }}left({SAE}right)) against ({{log }}left(Sright)) tend to have substantially positive slopes, whereas regression slopes of ({{log }}left({CPE}right)) and ({{log }}({S}_{{{{{{rm{pop}}}}}}})) against ({{log }}left(Sright)) tend to have negative or only slightly positive slopes; and hence regression slopes of ({{log }}left({S}_{{com}}right)), which are sums of the other slopes, tend to be positive primarily because of the positive relationship between log(SAE) and log(S). But, in another decomposition, we can also write ({{log }}left({S}_{{com}}right)={{log }}left({{SAE}}_{{{{{{rm{even}}}}}}}right)+{{log }}left({ENV}right)+{{log }}left({CPE}right)+{{log }}({S}_{{{{{{rm{pop}}}}}}})=frac{1}{2}{{log }}left(Sright)+{{log }}left({ENV}right)+{{log }}left({CPE}right)+{{log }}({S}_{{{{{{rm{pop}}}}}}})). We can observe that, because all our regressions of ({{log }}left({SAE}right)) against ({{log }}left(Sright)) had slope less than 1/2 (Figs. 2e and 4e), regressions of ({{log }}left({EVN}right)) versus ({{log }}left(Sright)) will have negative slope for our empirical systems. Therefore ({{log }}left({S}_{{com}}right)) tends to depend on ({{log }}left(Sright)) with a “baseline” slope of 1/2, for purely statistical reasons (the ({{log }}left({{SAE}}_{{{{{{rm{even}}}}}}}right)) effect), with effects of evenness (({{log }}left({ENV}right))), compensatory dynamics (log(CPE)) and population variability (log(Spop)) tending to reduce this slope. This again supports the conclusion that statistical averaging is the main reason for positive diversity-stability relationships in our data, now using a definition of statistical averaging that is purely statistical. We used SAE instead of ({{SAE}}_{{{{{{rm{even}}}}}}}) in Results to represent the concept of statistical averaging because it is a quantification of ideas which were previously present in the literature20, although it can be viewed instead as a combination of statistical averaging effects and effects of evenness of species variances.
Note that Loreau and de Mazancourt14 got synchrony as 1/S for the special case of identical ({sigma }_{i}) and zero-correlation, a value which seems to contrast with our result, (sqrt{S}), described above. But Loreau and de Mazancourt14 measured community-wide synchrony via a statistic (theta=frac{{sigma }_{T}^{2}}{{left({sum }_{i}{sigma }_{i}right)}^{2}}). We can make further transformation: (theta=frac{{({sigma }_{T}/{mu }_{{tot}})}^{2}}{{left({sum }_{i}{sigma }_{i}/{mu }_{{tot}}right)}^{2}}=frac{{{CV}}_{{com}}^{2}}{{left({sum }_{i}frac{{mu }_{i}}{{mu }_{{tot}}}bullet frac{{sigma }_{i}}{{mu }_{i}}right)}^{2}}=frac{{{CV}}_{{com}}^{2}}{{{CV}}_{{pop}}^{2}}). Here CVpop2 means the square of the weighted average of CVi. That is, θ measures a ratio of community-level CV2 to population level CV2. But in our system, our asynchrony measures a ratio of community-level stability (1/CV) to population-level stability. These differences in notational and terminological choices explain why we got (sqrt{S}) but Loreau and de Mazancourt14 got 1/S in this special case – our results are actually consistent with those of Loreau and de Mazancourt when notational choices are accounted for.
Further decomposition of CPE
In previous sections, we decomposed asynchrony into a statistical averaging effect (SAE) and a compensatory effect (CPE) via constructing a surrogate or hypothetical community stability value Scom_ip by quantifying what community stability would be if all species were to fluctuate independently and thus species covariances equal zero. By comparing the stability of the real community, Scom, to the hypothetical community stability under the assumption of species independence, Scom_ip, we then computed CPE as Scom/Scom_ip. But compensatory dynamics among species in a community can arise from at least two distinct mechanisms: (1) differential response of species to environmental perturbations, and (2) direct or indirect interactions among species19. Correspondingly, we can decompose CPE into two parts: compensatory effects due to differential responses to environmental perturbations (CPEenv) and compensatory effects due to species interactions (CPEint). We here initiate that decomposition.
To separate CPEenv from CPEint, we construct another surrogate/hypothetical community stability value, called ({S}_{{{{{{rm{com}}}}}}_{{{{{rm{sur}}}}}}}), for which the effects on community stability of components of species covariances due to species interactions are eliminated, keeping only components due to common species responses to environmental perturbations. In the following section “Surrogate community stability from different plots”, we establish surrogate community stability values for the systems of the plant survey dataset by means of time series from other plots from the same system. Time series from distinct plots may covary, but if they do, it is likely because of common environmental influences rather than species interactions. In the section “Surrogate community stability from monocultures”, we establish surrogate community stability values for the systems of the biodiversity-manipulated experimental dataset by means of monocultures which were established alongside the multi-species plots. Monocultures in separate plots may covary, but if they do, it is likely again because of common environmental influences rather than species interactions. Both approaches lead to approximations of the value community stability would take if species interactions were eliminated, and therefore ({S}_{{{{{{rm{com}}}}}}_{{{{{rm{sur}}}}}}}={{{CPE}}_{{{{{{rm{env}}}}}}}times {SAE}times S}_{{{{{{rm{pop}}}}}}}={{{CPE}}_{{{{{{rm{env}}}}}}}times S}_{{{com}}_{{{{rm{ip}}}}}}). Thus, we calculate CPEenv by
$${{CPE}}_{{{{{{rm{env}}}}}}}=frac{{S}_{{{{{{rm{com}}}}}}_{{{{{rm{sur}}}}}}}}{{S}_{{{{{{rm{com}}}}}}_{{{{{rm{ip}}}}}}}}.$$
(5)
The compensatory effect from species interactions is then the rest of CPE, i.e.
$${{CPE}}_{{{{{{rm{int}}}}}}}=frac{{CPE}}{{{CPE}}_{{env}}}=frac{{S}_{{com}}/{S}_{{{com}}_{{ip}}}}{{S}_{{com}_{sur}}/{S}_{{com}_{ip}}}=frac{{S}_{{{{{{rm{com}}}}}}}}{{S}_{{{{{{rm{com}}}}}}_{{{{{rm{sur}}}}}}}}.$$
(6)
Both CPEenv and CPEint can be either greater than or less than 1, corresponding to the fact that joint species responses to environmental perturbations (CPEenv) and species interactions (CPEint) can be either stabilizing or destabilizing. Thus, we have decomposed community stability, Scom, into four parts (Supplementary Fig. 5): population stability, Spop; the statistical averaging effect, SAE; compensatory effects due to joint species responses to environmental perturbations, CPEenv; and compensatory effects due to species interactions, CPEint.
In the Introduction and Results, the term “portfolio effect” was used for the stabilizing effect of biodiversity from independent fluctuations (zero correlation) of species abundances through time20, measured by SAE. However, other studies have adopted an alternative definition of portfolio effects, allowing for non-zero correlations between species induced by common responses to the environment, but excluding components of correlations resulting from species interactions12,19,21. In our framework, this alternative definition of portfolio effects could be represented by SAE × CPEenv (Supplementary Fig. 5). We prefer to group CPEenv together with CPEint rather than with SAE, because SAE corresponds to primarily statistical mechanisms, whereas CPEenv involves species responses to the environment, which we regard as containing biological information. But other interpretations and classifications can certainly be reasonable, as demonstrated by earlier studies which defined portfolio effects in a way that corresponds to our product SAE × CPEenv12,19,21. In the following sections, we decompose diversity-stability relationships into components that come from the relationship of each of the four terms of Supplementary Fig. 5 to diversity.
Surrogate community stability from different plots
In this section, we explain how we construct surrogate communities that apply to observational data without monoculture data, such as our plant survey dataset. In doing so, we calculate an approximate hypothetical/surrogate community stability index, Scom_sur, using populations from different mixture plots. The general idea is to construct a surrogate covariance matrix,
$${{{{{rm{C}}}}}}=left(begin{array}{ccc}{V}_{11} & cdots & {C}_{1S} vdots & ddots & vdots {C}_{S1} & cdots & {V}_{{SS}}end{array}right),$$
which will be used to calculate Scom_sur. Again using xi(t) to denote the biomass of species i = 1, …, S at time t = 1, …, t in a focal plot, the ith diagonal element of the surrogate covariance matrix, Vii, is the variance of xi(t). The off-diagonal element Cij equals the covariance of xi and Zj, where ({Z}_{j}left(tright)=frac{{z}_{j}left(tright)-{bar{z}}_{j}}{{{{{{rm{sd}}}}}}left({z}_{j}right)}times {{{{{rm{sd}}}}}}left({x}_{j}right)+{bar{x}}_{j}) is a surrogate time series, with zj representing a time series of species j taken from another plot. When more than one other plot contained species j, we randomly chose one. The time series ({Z}_{j}left(tright)) was constructed to have the same mean and variance as xj(t), but its correlations with xj(t) reflects correlations between two non-interacting populations of species i and j from distinct plots. We then calculated ({S}_{{{{{{rm{com}}}}}}_{{{{{rm{sur}}}}}}}=frac{{bar{X}}_{{{{{{rm{tot}}}}}}}}{sqrt{mathop{sum}limits_{{ij}}{C}_{{ij}}}}), and we calculated CPEenv and CPEint from Eqs. (5) and (6). This method was applied only to the plant survey dataset, where plots at a site within the plant survey dataset were considered replicates. In the biodiversity-manipulated experimental dataset, different plots of the same diversity had different artificially maintained species composition.
Surrogate community stability from monocultures
In this section, we explain how we construct surrogate communities that apply to a biodiversity experiment with monoculture data. Given a community consisting of species i = 1, …, S, recall that xi(t) denotes the biomass of species i = 1, …, S at time t = 1, …, t. Suppose monocultures, yi(t), were maintained for each species over the same time period. When more than one monoculture was maintained for a species, we randomly selected one. The hypothetical/surrogate community stability value ({S}_{{{{{{rm{com}}}}}}_{{{{{rm{sur}}}}}}}) was computed as ({S}_{{{{{{rm{com}}}}}}_{{{{{rm{sur}}}}}}}=frac{{bar{Y}}_{{{{{{rm{tot}}}}}}}}{{{{{{rm{sd}}}}}}left({Y}_{{{{{{rm{tot}}}}}}}right)}), where ({Y}_{{{{{rm{tot}}}}}}left(tright)={sum }_{i}{Y}_{i}(t)) and ({Y}_{i}left(tright)=frac{{y}_{i}left(tright)-{bar{y}}_{i}}{{{{{{rm{sd}}}}}}left({y}_{i}right)}times {{{{{rm{sd}}}}}}left({x}_{i}right)+{bar{x}}_{i}). The time series ({Y}_{i}left(tright)) were constructed to have the same mean and variance as the time series xi(t), but with correlations equal to those between the corresponding monocultures. CPEenv and CPEint were then calculated from Eqs. (5) and (6). We repeated this process 100 times, randomly selecting from among available monocultures for each species on each repeat calculation, and taking means to get CPEenv and CPEint. This method was only used for the biodiversity-manipulated experimental dataset because monocultures were not available for the plant survey dataset.
It is worth noting that our construction of the surrogate community assumes that in the absence of species interactions, species in the mixture should exhibit the same correlation as those in monocultures. While this assumption may hold if population dynamics were solely influenced by environmental stochasticity, the influence of demographic stochasticity may lead to a lower correlation in the mixture than in monocultures due to the relatively lower population size in the mixture. In other words, our surrogate community might overestimate the correlation between species and thus underestimate ({S}_{{{{{{rm{com}}}}}}_{{{{{rm{sur}}}}}}}) (i.e., the expected stability of a community without interactions). Moreover, the possible underestimation of ({S}_{{{{{{rm{com}}}}}}_{{{{{rm{sur}}}}}}}) could be more pronounced for more diverse communities (which have substantially lower population sizes in mixtures than monocultures), thus possibly creating an artificially negative correlation between species richness and CPEenv ((={S}_{{{{{{rm{com}}}}}}_{{{{{rm{sur}}}}}}}/{S}_{{{{{{rm{com}}}}}}_{{{{{rm{ip}}}}}}})), as well as a positive correlation between species richness and CPEint ((={S}_{{{{{{rm{com}}}}}}}/{S}_{{com}_{sur}})). So, the patterns in Supplementary Fig. 7a, b might be partially explained by such artificial effects. However, how to separate out such artificial effects from other processes was by no means obvious30. We note, importantly, that such effects, even if they exist, should not affect our main conclusion. Indeed, the positive correlation between species richness and SAE × CPEenv (Supplementary Fig. 7d) would be even stronger if such effects existed and were accounted for.
Statistical analysis
For the plant survey dataset, the values of each of our theoretical quantities were averaged across all plots for each site and then relationships between site averages were assessed with regression. For the biodiversity-manipulated experimental dataset, theoretical quantities were calculated for individual plots because species richness was manipulated at the plot level. All theoretical quantities were log10 transformed prior to generating plots and preforming regressions, except for the Shannon index, because the Shannon index has a log transformation already embedded in its definition. All analyses were programmed in R 4.0.537.
To compute the confidence interval of CPE, we constructed bootstrapped species time series. For each plot, separately, the AAFT procedure38, implemented in the surrog function in the R package wsyn, was used to generate bootstrapped species time series for the plot, for which time series were rendered uncorrelated while retaining the autocorrelation and marginal-distribution properties of the original data. Such a bootstrapped dataset instantiates the null hypothesis that species were unrelated while retaining other statistical properties of the data not related to species relationships. For each plot, 1000 bootstrapped datasets were computed, giving rise to 1000 bootstrapped quantities Scom_ip(i) for i = 1, …, 1000. We then computed the quantities SAE(i) = Scom_ip(i)/Spop and CPE(i) = Scom/(SAE(i)Spop). Confidence intervals for CPE were computed using quantiles of the distribution CPE(i).
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Source: Ecology - nature.com