    Rank-invariant estimation of inbreeding coefficients

    Statistical sampling

    We can describe the dependence between pairs of uniting alleles in a single population without invoking an evolutionary model for the history of the population. In this “statistical sampling” framework (Weir, 1996) we do not consider the variation associated with evolutionary processes but we do consider the variation among samples from the same population. Although extensive sets of genetic data allow individual-level inbreeding coefficients to be estimated with high precision, we start with population-level estimation.

    Allelic dependencies can be quantified with the within-population inbreeding coefficient, written here as \(f_W\) to emphasize it is a within-population quantity, defined by
    $$H_l = 2p_l(1-p_l)(1-f_W)$$
    (1)
    where \(H_l\) is the population proportion of heterozygotes for the reference allele at SNP \(l\) and \(p_l\) is the population proportion of that allele. The same value of \(f_W\) is assumed to apply for all SNPs. An immediate consequence of this definition is that the population proportions of homozygotes for the reference and alternative alleles are \(p_l^2 + p_l(1-p_l)f_W\) and \((1-p_l)^2 + p_l(1-p_l)f_W\), respectively. This formulation allows \(f_W\) to be negative, with the maximum of \(-p_l/(1-p_l)\) and \(-(1-p_l)/p_l\) as lower bound. It is bounded above by 1. Hardy–Weinberg equilibrium (HWE) corresponds to \(f_W = 0\), and textbooks (e.g., Hedrick, 2000) point out that negative values of \(f_W\) indicate more heterozygotes than expected under HWE.

    Observed heterozygote proportions \(\tilde{H}_l\) have \(H_l\) as within-population expectation \(\mathcal{E}_W\) over samples from the study population, \(\mathcal{E}_W(\tilde{H}_l) = H_l\), and this would provide a simple estimator of \(f_W\) if the population allele proportions were known. In practice, however, these proportions are unknown. Steele et al. (2014) suggested use of data external to the study sample to provide reference allele proportions in forensic applications where a reference database is used for making inferences about the population relevant for a particular crime. The more usual approach is to use study sample proportions \(\tilde{p}_l\) in place of the true proportions \(p_l\), as in equation 1 of Li & Horvitz (1953):
    $$\hat{f}_{W_l} = 1 - \frac{\tilde{H}_l}{2\tilde{p}_l(1-\tilde{p}_l)}$$
    (2)
    The moment estimator in Eq. (2) is also an MLE of \(f_W\) when only one locus is considered, but it is biased (Robertson & Hill, 1984): not only is it a ratio of statistics, but the expected value \(\mathcal{E}_W[2\tilde{p}_l(1-\tilde{p}_l)]\) over repeated samples of \(n\) from the population is \(2p_l(1-p_l)[1-(1+f_W)/(2n)]\) (e.g., Weir, 1996, p39).

    This approach can be used to estimate the within-population inbreeding coefficient \(f_j\) for each individual \(j\) in a sample from one population. These are the “simple” estimators of Hall et al. (2012) and the \(\hat{f}_{\mathrm{HOM}_j}\) of Yengo et al. (2017):
    $$\hat{f}_{\mathrm{HOM}_{jl}} = 1 - \frac{\tilde{H}_{jl}}{2\tilde{p}_l(1-\tilde{p}_l)}$$
    (3)
    The sample heterozygosity indicator \(\tilde{H}_{jl}\) is one if individual \(j\) is heterozygous at SNP \(l\) and is zero otherwise. Averaging Eq. (3) over individuals gives the estimator based on SNP \(l\) in Eq. (2).

    A single SNP provides estimates that are either 1 (for a homozygote) or a negative value that depends on \(\tilde{p}_l\) (for a heterozygote), so many SNPs are used in practice. In both Hall et al. (2012) and Yengo et al. (2017) data were combined over loci as weighted or “ratio of averages” estimators:
    $$\hat{f}_{\mathrm{HOM}_j} = 1 - \frac{\sum_l \tilde{H}_{jl}}{\sum_l [2\tilde{p}_l(1-\tilde{p}_l)]}$$
    (4)
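As an illustration of the ratio-of-averages form in Eq. (4), the following is a minimal sketch, assuming genotypes are stored as a NumPy array of reference-allele dosages (0, 1 or 2) with individuals in rows and SNPs in columns. The function name `f_hom` and the toy data are hypothetical and are not taken from any of the cited software.

```python
import numpy as np

def f_hom(dosage):
    """Sketch of the ratio-of-averages estimator in Eq. (4), one value per individual.

    dosage: (n_individuals, n_snps) array of reference-allele counts 0/1/2.
    """
    X = np.asarray(dosage, dtype=float)
    p = X.mean(axis=0) / 2.0                 # sample allele proportions per SNP
    het = (X == 1).astype(float)             # heterozygosity indicators per individual and SNP
    denom = np.sum(2.0 * p * (1.0 - p))      # sum over SNPs of 2 p (1 - p)
    if denom == 0:
        raise ValueError("at least one SNP must be polymorphic in the sample")
    return 1.0 - het.sum(axis=1) / denom

# Toy data (hypothetical): 4 individuals, 4 SNPs, every sample allele proportion is 0.5.
X = np.array([[0, 2, 2, 0],    # homozygous at every SNP
              [1, 1, 1, 1],    # heterozygous at every SNP
              [2, 0, 1, 1],
              [1, 1, 0, 2]])
f = f_hom(X)
```

In this toy sample the fully homozygous individual receives an estimate of 1 and the fully heterozygous individual a negative estimate, which reflects the fact that these are within-population quantities measured relative to the sample itself.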
    Gazal et al. (2014) referred to this estimator as \(f_{\mathrm{PLINK}}\) as it is an option in PLINK. We show below the good performance of this weighted estimator for large sample sizes and large numbers of loci. We will consider throughout that a large number \(L\) of SNPs are used so that ratios of sums of statistics over loci, such as in Eq. (4), have expected values equal to the ratio of expected values of their numerators and denominators. Ochoa & Storey (2021) showed statistics of the form \(\tilde{A}_L/\tilde{B}_L\), where \(\tilde{A}_L = \sum_{l=1}^{L} a_l/L\) and \(\tilde{B}_L = \sum_{l=1}^{L} b_l/L\), have expected values that converge almost surely to the ratio \(A/B\) when \(\mathcal{E}_W(\tilde{A}_L) = Ac_L\) and \(\mathcal{E}_W(\tilde{B}_L) = Bc_L\). This result rests on the expectations \(\mathcal{E}_W(a_l) = Ac_l\) and \(\mathcal{E}_W(b_l) = Bc_l\) with \(c_L = \sum_{l=1}^{L} c_l/L\). It requires \(|a_l|\), \(|b_l|\) to both be no greater than some finite quantity \(C\), \(c_L\) to converge to a finite value \(c\) as \(L\) increases, and for \(Bc\) not to be zero. For the ratio in Eq. (4), \(a_l = \tilde{H}_{jl}\), \(b_l = 2\tilde{p}_l(1-\tilde{p}_l)\) so \(A = (1-f_j)\), \(B = 1\) for large sample sizes \(n\), and \(c_L = \sum_l 2p_l(1-p_l)/L \le 1/2\). The conditions are satisfied providing at least one SNP is polymorphic. For an “average of ratios” estimator of the form \(\sum_{l=1}^{L}(a_l/b_l)/L\), the denominators \(b_l\) can be very small and convergence of its expected value is not assured.

    As an alternative to using sample allele frequencies, Hall et al. (2012) used maximum likelihood to estimate population allele proportions for multiple loci whereas Ayres & Balding (1998) used Markov chain Monte Carlo methods in a Bayesian approach that integrated out the allele proportion parameters.
Neither of those papers considered data of the size we now face in sequence-based studies of many organisms, and we doubt the computational effort to estimate, or integrate over, hundreds of millions of allele proportions in Eqs. (2) or (4) adds much value to inferences about \(f\). The allele-sharing estimators we describe below regard allele probabilities as unknown nuisance parameters and we show how to avoid estimating them or assigning them values.

    Hall et al. (2012) used an EM algorithm to find MLEs for \(f_j\) when population allele proportions were regarded as being known and equal to sample proportions. Alternatively, a grid search can be conducted over the range of validity for the single parameter \(f_j\) that maximizes the log-likelihood
    $$\ln[\mathrm{Lik}(f_j)] = \mathrm{Constant} + \sum_{l=1}^{L}\left\{\tilde{H}_{jl}\ln[(1-f_j)] + (1-\tilde{H}_{jl})\ln[1-2\tilde{p}_l(1-\tilde{p}_l)(1-f_j)]\right\}$$

    Estimation of the within-population inbreeding coefficients \(f_W\) (\(F_{IS}\) of Wright, 1922) and \(f_j\) does not require any information beyond genotype proportions in samples from a study population, nor does it make any assumptions about that population or the evolutionary forces that shaped the population. The coefficients are simply measures of dependence of pairs of alleles within individuals.

    Genetic sampling

    Inbreeding parameters of most interest in genetic studies are those that recognize the contribution of previous generations to inbreeding in the present study population. This requires accounting for “genetic sampling” (Weir, 1996) between generations, thereby leading to an ibd interpretation of inbreeding: ibd alleles descend from a single allele in a reference population. It also allows the prediction of inbreeding coefficients by path counting when pedigrees are known (Wright, 1922).
If individual \(J\) is ancestral to both individuals \(j'\) and \(j''\), and if there are \(n\) individuals in the pedigree path joining \(j'\) to \(j''\) through \(J\), then \(F_j = \sum (0.5)^n(1+F_J)\) where \(F_J\) is the inbreeding coefficient of ancestor \(J\) and \(F_j\) is the inbreeding coefficient of offspring \(j\) of parents \(j'\) and \(j''\). The sum is over all ancestors \(J\) and all paths joining \(j'\) to \(j''\) through \(J\). The expression is also the coancestry \(\theta_{j'j''}\) of \(j'\) and \(j''\): the probability an allele drawn randomly from \(j'\) is ibd to an allele drawn randomly from \(j''\).

    The allele proportion \(p_l\) in a study population has expectation \(\pi_l\) over evolutionary replicates of the population from an ancestral reference population to the present time. Sample allele proportions \(\tilde{p}_l\) provide information about the population proportions \(p_l\), and their statistical sampling properties follow from the binomial distribution. We do not invoke a specific genetic sampling distribution for the \(p_l\) about their expectations \(\pi_l\) although we do assume the second moments of that distribution depend on probabilities of ibd for pairs of alleles. One consequence of the assumed moments is that the probability of individual \(j\) in the study sample being heterozygous, i.e., the total expected value \(\mathcal{E}_T\) of the heterozygosity indicator over replicates of the history of that individual, is
    $$\mathcal{E}_T(\tilde{H}_{jl}) = 2\pi_l(1-\pi_l)(1-F_j)$$
    (5)
    The quantity \(F_j\) is the individual-specific version of \(F_{IT}\) of Wright (1922) and we can regard it as the probability the two alleles at any locus for individual \(j\) are ibd. There is an implicit assumption in Eq. (5) that the reference population needed to define ibd is infinite and in HWE: there is probability \(F_j\) that \(j\) has homologous alleles with a single ancestral allele in that population and probability \((1-F_j)\) of \(j\) having homologous alleles with distinct ancestral alleles there. In the first place, the single ancestral allele has probability \(\pi\) of being the reference allele for that locus and the implicit assumption is that two ancestral alleles are both the reference type with probability \(\pi^2\). This does not mean there is an actual ancestral population with those properties, any more than use of \(\mathcal{E}_T\) means there are actual replicates of the history of any population or individual, and we note that Eq. (5) does not allow higher heterozygosity than predicted by HWE. Nonetheless, the concept of ibd allows theoretical constructions of great utility and we now present a framework for approaching empirical situations.

    Inbreeding, or ibd, implies a common ancestral origin for uniting alleles and statements about sample allele proportions \(\tilde{p}_l\) require consideration of possible ibd for other pairs of alleles in the sample. The total expectation of \(2\tilde{p}_l(1-\tilde{p}_l)\) over samples from the population and over evolutionary replicates of the study population is (Weir, 1996, p176)
    $$\mathcal{E}_T[2\tilde{p}_l(1-\tilde{p}_l)] = 2\pi_l(1-\pi_l)\left[(1-\theta_S) - \frac{1}{2n}\left(1+F_W-2\theta_S\right)\right]$$
    (6)
    where \(F_W\) is the parametric inbreeding coefficient averaged over sample members, \(F_W = \sum_{j=1}^{n} F_j/n\), and \(\theta_S\) is the average parametric coancestry in the sample, \(\theta_S = \sum_{j=1}^{n}\sum_{j' \ne j}\theta_{jj'}/[n(n-1)]\). Equivalent expressions were given by McPeek et al. (2004) and DeGiorgio and Rosenberg (2009). We note the relationship \(f_W = (F_W-\theta_S)/(1-\theta_S)\) given by Wright (1922) and we showed in WG17 the equivalent expression \(f_j = (F_j-\theta_S)/(1-\theta_S)\) for individual-specific values (\(\theta_S\) is Wright’s \(F_{ST}\)).

    For a large number of SNPs, the expectation of a ratio estimator of the type considered here is the ratio of expectations (Ochoa & Storey, 2021). Therefore, the total expectations of the \(\hat{f}_{\mathrm{HOM}_j}\), taking into account both statistical and genetic sampling, are
    $$\mathcal{E}_T(\hat{f}_{\mathrm{HOM}_j}) = 1 - \frac{1-F_j}{(1-\theta_S) - \frac{1}{2n}\left(1+F_W-2\theta_S\right)} = \frac{f_j - \frac{1}{2n}(1+f_W)}{1 - \frac{1}{2n}(1+f_W)}$$
    (7)
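To make the size of the small-sample bias in Eq. (7) concrete, here is a short sketch that evaluates both forms of the expectation and confirms they agree. The function names and the parameter values are illustrative assumptions, not values from the study.

```python
def expected_f_hom_total(F_j, F_W, theta_S, n):
    """First form of Eq. (7), written with the ibd quantities F_j, F_W and theta_S."""
    denom = (1.0 - theta_S) - (1.0 + F_W - 2.0 * theta_S) / (2.0 * n)
    return 1.0 - (1.0 - F_j) / denom

def expected_f_hom_within(f_j, f_W, n):
    """Second form of Eq. (7), written with the within-population quantities f_j and f_W."""
    c = (1.0 + f_W) / (2.0 * n)
    return (f_j - c) / (1.0 - c)

# Illustrative (hypothetical) parameter values.
F_j, F_W, theta_S = 0.20, 0.10, 0.05
f_j = (F_j - theta_S) / (1.0 - theta_S)   # within-population value for individual j
f_W = (F_W - theta_S) / (1.0 - theta_S)   # within-population average

small = expected_f_hom_within(f_j, f_W, n=20)
large = expected_f_hom_within(f_j, f_W, n=20000)
same = expected_f_hom_total(F_j, F_W, theta_S, n=20)
```

Under these assumed values, `small` falls below `f_j` by an amount of order 1/n, `large` is very close to `f_j`, and the two forms of Eq. (7) return the same number.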
    For all sample sizes, \(\hat{f}_{\mathrm{HOM}_j}\) has an expected value less than the true value \(f_j\), with the bias being of the order of \(1/n\). The ranking of \(\mathcal{E}_T(\hat{f}_{\mathrm{HOM}_j})\) values, however, is the same as the ranking of the \(f_j\) and, therefore, of the \(F_j\). For large sample sizes, Eq. (7) reduces to \(\mathcal{E}_T(\hat{f}_{\mathrm{HOM}_j}) = f_j\). Averaging over individuals shows that \(\mathcal{E}_T(\hat{f}_{\mathrm{HOM}}) = f_W\): the population-level estimator in Eq. (2) has total expectation of \(f_W\), not \(F_W\).

    A different outcome is found for the \(\hat{f}_{\mathrm{UNI}_j}\) estimator of Yengo et al. (2017) (i.e., \(\hat{f}^{III}\) of Yang et al. (2011); \(\hat{f}_{\mathrm{GCTA3}}\) of Gazal et al. (2014)). This estimator, with the weighted (w) ratio of averages over loci we recommend, as opposed to the unweighted (u) average of ratios over loci used in their papers, is
    $$\hat{f}_{\mathrm{UNI}_j}^{w} = \frac{\sum_{l=1}^{L}[X_{jl}^2 - (1+2\tilde{p}_l)X_{jl} + 2\tilde{p}_l^2]}{\sum_{l=1}^{L} 2\tilde{p}_l(1-\tilde{p}_l)}$$
    (8)
    In this equation \(X_{jl}\) is the reference allele dosage, the number of copies of the reference allele, at SNP \(l\) for individual \(j\). It is equivalent to the estimator given by Ritland (1996, eq. 5) and attributed by him to Li & Horvitz (1953).

    Ochoa & Storey (2021) showed that \(\hat{f}_{\mathrm{UNI}_j}^{w}\) has expectation, for a large number of SNPs and a large sample size, of
    $$\mathcal{E}_T(\hat{f}_{\mathrm{UNI}_j}^{w}) = \frac{F_j - 2\Psi_j + \theta_S}{1-\theta_S} = f_j - 2\psi_j$$
    (9)
    where \(\Psi_j\) is the average coancestry of individual \(j\) with other members of the study sample: \(\Psi_j = \sum_{j'=1, j' \ne j}^{n}\theta_{jj'}/(n-1)\). We term \(\psi_j = (\Psi_j-\theta_S)/(1-\theta_S)\) the within-population individual-specific average kinship coefficient. The \(\Psi_j\) have an average of \(\theta_S\) over members of the sample, so the average of the \(\psi_j\)’s is zero and the expected value of the average of the \(\hat{f}_{\mathrm{UNI}_j}^{w}\) is \(f_W\), as is the case for \(\hat{f}_{\mathrm{AS}_j}\) below.

    Equation (9) shows that the \(\hat{f}_{\mathrm{UNI}_j}^{w}\) have expected values with the same ranking as the \(F_j\) values only if every individual \(j\) in the sample has the same average kinship \(\psi_j\) with other sample members.

    Finally, we mention another common estimator described by VanRaden (2008), termed \(f_{\mathrm{GCTA1}}\) by Gazal et al. (2014) and available from the GCTA software (Yang et al., 2011) with option --ibc. We referred to this as the “standard” estimator in WG17. The weighted version for multiple loci is
    $$\hat{f}_{\mathrm{STD}_j}^{w} = \frac{\sum_l (X_{jl}-2\tilde{p}_l)^2}{\sum_l 2\tilde{p}_l(1-\tilde{p}_l)} - 1$$
    (10)
    and it has the large-sample expectation of \((f_j - 4\psi_j)\) as is implied by WG17 (Eq. 13) and as was given by Ochoa & Storey (2021). We summarize the various measures of inbreeding and coancestry in Table 1, and we include sample sizes in the expectations shown in Table 2.

    Table 1. Measures of inbreeding and coancestry.

    Table 2. Estimators of inbreeding.

    The \(\hat{f}_{\mathrm{HOM}}\), \(\hat{f}_{\mathrm{UNI}}\), \(\hat{f}_{\mathrm{STD}}\) and \(\hat{f}_{\mathrm{MLE}}\) estimators of individual or population inbreeding coefficients make explicit use of sample allele proportions. This means that all four have small-sample biases, and none of the four provide estimates of the ibd quantities \(F\) or \(F_j\). We showed that \(\hat{f}_{\mathrm{HOM}}\) is actually estimating the within-population inbreeding coefficients: the total inbreeding coefficients relative to the average coancestry of pairs of individuals in the sample, but \(\hat{f}_{\mathrm{UNI}}\) and \(\hat{f}_{\mathrm{STD}}\) are estimating expressions that also involve average kinships \(\psi\).

    Allele sharing

    In a genetic sampling framework, and with the ibd viewpoint, we consider within-individual allele sharing proportions \(A_{jl}\) for SNP \(l\) in individual \(j\) (we wrote \(M\) rather than \(A\) in WG17 and in Goudet et al., 2018). These equal one for homozygotes and zero for heterozygotes and sample values can be expressed in terms of allele dosages, \(\tilde{A}_{jl} = (X_{jl}-1)^2\). We also consider between-individual sharing proportions \(A_{jj'l}\) for SNP \(l\) and individuals \(j\) and \(j'\). These are equal to one for both individuals being the same homozygote, zero for different homozygotes, and 0.5 otherwise. Observed values can be written as \(\tilde{A}_{jj'l} = [1 + (X_{jl}-1)(X_{j'l}-1)]/2\), with an average over all pairs of distinct individuals in a sample of \(\tilde{A}_{Sl}\).
Astle & Balding (2009) introduced \(\tilde{A}_{jj'l}\) as a measure of identity in state of alleles chosen randomly from individuals \(j\) and \(j'\), and Ochoa & Storey (2021) used a simple transformation of this quantity. The allele sharing for an individual with itself is \(A_{jjl} = (1 + A_{jl})/2\).

    The same logic that led to Eq. (5) provides total expectations for allele-sharing proportions for all \(j, j'\):
    $$\begin{array}{lll}\mathcal{E}_T(\tilde{A}_{jj'l}) &=& 1 - 2\pi_l(1-\pi_l)(1-\theta_{jj'})\\ \mathcal{E}_T(\tilde{A}_{Sl}) &=& 1 - 2\pi_l(1-\pi_l)(1-\theta_S)\end{array}$$
    Note that \(\theta_{jj} = (1+F_j)/2\). The nuisance parameter \(2\pi_l(1-\pi_l)\) cancels out of the ratio \(\mathcal{E}_T(\tilde{A}_{jj'l}-\tilde{A}_{Sl})/\mathcal{E}_T(1-\tilde{A}_{Sl})\) and this motivates definitions of allele-sharing estimators of the inbreeding coefficient for individual \(j\) and the kinship coefficient for individuals \(j, j'\) as
    $$\hat{f}_{\mathrm{AS}_j} = \frac{\sum_l(\tilde{A}_{jl}-\tilde{A}_{Sl})}{\sum_l(1-\tilde{A}_{Sl})}, \qquad \hat{\psi}_{\mathrm{AS}_{jj'}} = \frac{\sum_l(\tilde{A}_{jj'l}-\tilde{A}_{Sl})}{\sum_l(1-\tilde{A}_{Sl})}$$
    (11)
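The allele-sharing estimators in Eq. (11) need only the dosage matrix, with no allele proportions. The following is a minimal sketch under the assumption that genotypes are held as a complete (no missing data) NumPy array of dosages with individuals in rows; the function name `allele_sharing` and the toy data are hypothetical.

```python
import numpy as np

def allele_sharing(dosage):
    """Sketch of Eq. (11): returns (f_AS per individual, psi_AS matrix)."""
    X = np.asarray(dosage, dtype=float)
    n, L = X.shape
    Y = X - 1.0
    # Entry (j, j') is the sum over SNPs of [1 + (X_jl - 1)(X_j'l - 1)] / 2.
    A_pair = (L + Y @ Y.T) / 2.0
    # Within-individual sharing summed over SNPs: sum_l (X_jl - 1)^2.
    A_self = np.sum(Y ** 2, axis=1)
    # Average over all pairs of distinct individuals, summed over SNPs.
    off_diag = ~np.eye(n, dtype=bool)
    A_S = A_pair[off_diag].mean()
    denom = L - A_S
    if denom == 0:
        raise ValueError("no variation in allele sharing among pairs")
    f_as = (A_self - A_S) / denom
    # Off-diagonal entries are pairwise kinship estimates; the diagonal holds
    # the self-sharing (1 + A_j)/2 on the same scale, not a pair kinship.
    psi_as = (A_pair - A_S) / denom
    return f_as, psi_as

# Toy data (hypothetical): 4 individuals, 4 SNPs.
X = np.array([[0, 2, 2, 0],
              [1, 1, 1, 1],
              [2, 0, 1, 1],
              [1, 1, 0, 2]])
f_as, psi_as = allele_sharing(X)
```

By construction the off-diagonal kinship estimates average to zero over pairs in the sample, and `f_as` is a linear function of each individual's total homozygosity.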
    For a large number of SNPs, these are unbiased for \(f_j\) and \(\psi_{jj'}\) for all sample sizes. We showed in WG17 there is no need to filter on minor allele frequency to preserve the lack of bias. Note that \(\hat{f}_{\mathrm{AS}_j}\) is a linear function of the form \(a_S + b_S\tilde{A}_j\) with \(\tilde{A}_j\) being the total homozygosity for \(j\) and constants \(a_S\), \(b_S\) being the same for all individuals \(j\). Changing the scope of the study, from population to world for example, preserves linearity (with different values of \(a_S\), \(b_S\)). The changed estimates are linear functions of the old estimates: old and new estimates are completely correlated and are rank invariant over all samples that include particular individuals, i.e., over all reference populations. Unlike the case for \(\hat{f}_{\mathrm{UNI}}\) or \(\hat{f}_{\mathrm{STD}}\), rank invariance is guaranteed for \(\hat{f}_{\mathrm{AS}_j}\) for any two individuals even if only one more individual is added to the study.

    For large sample sizes, \((1-\tilde{A}_{Sl}) \approx 2\tilde{p}_l(1-\tilde{p}_l)\). Under that approximation, \(\hat{f}_{\mathrm{AS}_j}\) is the same as \(\hat{f}_{\mathrm{HOM}_j}\) but the approximation is not necessary in computer-based analyses. Summing the large-sample estimates over individuals not equal to \(j\) gives an estimator for the average individual kinship \(\psi_j\):
    $$\hat{\psi}_{\mathrm{AS}_j} = -\frac{\sum_l (X_{jl}-2\tilde{p}_l)(1-2\tilde{p}_l)}{\sum_l 4\tilde{p}_l(1-\tilde{p}_l)}$$
    (12)
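The average-kinship estimator of Eq. (12) ties together the estimators of Eqs. (4), (8) and (10): expanding the numerators shows that the weighted UNI estimator plus twice Eq. (12), and the weighted STD estimator plus four times Eq. (12), both equal the ratio-of-averages form in Eq. (4), which is the large-sample form of the allele-sharing estimator. The sketch below checks this numerically on simulated dosages. The function names and the random data are assumptions made for illustration only.

```python
import numpy as np

def sample_p(X):
    return X.mean(axis=0) / 2.0                      # sample allele proportions

def f_hom(X):                                        # Eq. (4)
    p = sample_p(X)
    return 1.0 - (X == 1).sum(axis=1) / np.sum(2 * p * (1 - p))

def f_uni_w(X):                                      # Eq. (8)
    p = sample_p(X)
    num = np.sum(X ** 2 - (1 + 2 * p) * X + 2 * p ** 2, axis=1)
    return num / np.sum(2 * p * (1 - p))

def f_std_w(X):                                      # Eq. (10)
    p = sample_p(X)
    return np.sum((X - 2 * p) ** 2, axis=1) / np.sum(2 * p * (1 - p)) - 1.0

def psi_as(X):                                       # Eq. (12)
    p = sample_p(X)
    num = np.sum((X - 2 * p) * (1 - 2 * p), axis=1)
    return -num / np.sum(4 * p * (1 - p))

# Simulated dosages (hypothetical data): 50 individuals, 200 SNPs.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(50, 200)).astype(float)

uni_corrected = f_uni_w(X) + 2 * psi_as(X)
std_corrected = f_std_w(X) + 4 * psi_as(X)
```

Both corrected vectors match `f_hom(X)` to floating-point precision, and the values of `psi_as(X)` average to zero over the sample because the dosages are centered on the sample allele proportions.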
    Adding \(2\hat{\psi}_{\mathrm{AS}_j}\) to \(\hat{f}_{\mathrm{UNI}_j}^{w}\) gives \(\hat{f}_{\mathrm{AS}_j}\), as expected, as does adding \(4\hat{\psi}_{\mathrm{AS}_j}\) to \(\hat{f}_{\mathrm{STD}_j}^{w}\). Similarly, \(\hat{\psi}_{\mathrm{AS}_{jj'}}\) is obtained by adding \(\hat{\psi}_{\mathrm{AS}_j}\) and \(\hat{\psi}_{\mathrm{AS}_{j'}}\) to \(\hat{\psi}_{\mathrm{STD}_{jj'}}\), where (Yang et al., 2011)
    $$\hat{\psi}_{\mathrm{STD}_{jj'}} = \frac{\sum_l (X_{jl}-2\tilde{p}_l)(X_{j'l}-2\tilde{p}_l)}{\sum_l 4\tilde{p}_l(1-\tilde{p}_l)}$$
    These are the elements of the first method for constructing the GRM given by VanRaden (2008).

    When inbreeding and coancestry coefficients are defined as ibd probabilities they are non-negative, but the within-population values \(f\) and \(\psi\) will be negative for individuals, or pairs of individuals, having smaller ibd allele probabilities than do pairs of individuals in the sample, on average. Individual-specific values of \(f\) always have the same ranking as the individual-specific \(F\) values, and they are estimable. Negative estimates can be avoided by the transformation to \((\hat{f}_{\mathrm{AS}_j}-\hat{f}_{\mathrm{AS}_j}^{\min})/(1-\hat{f}_{\mathrm{AS}_j}^{\min})\) where \(\hat{f}_{\mathrm{AS}_j}^{\min}\) is the smallest value over individuals of the \(\hat{f}_{\mathrm{AS}_j}\)’s. We do not see the need for this transformation, and we noted above the recognized utility of negative values. Ochoa & Storey (2021) wished to estimate \(F_j\) rather than \(f_j\) and, to overcome the lack of information about the ancestral population serving as a reference point for ibd, they assumed the least related pair of individuals in a sample have a coancestry of zero.
We showed in WG17 that this brings estimates in line with path-counting predicted values when founders are assumed to be not inbred and unrelated, but we prefer to avoid the assumption. We stress that, absent external information or assumptions, \(F\) is not estimable. Instead, linear functions of \(F\) that describe ibd of target pairs of alleles relative to ibd in a specified set of alleles are estimable and have utility in empirical studies.

    Runs of homozygosity

    Each of the inbreeding estimators considered so far has been constructed for individual SNPs and then combined over SNPs. Observed values of allelic state are used to make inferences about the unobserved state of identity by descent. Estimators based on runs of homozygosity (ROH), however, suppose that ibd for a region of the genome can be observed. Although \(F\) is the probability an individual has ibd alleles at any single SNP, in fact ibd occurs in blocks within which there has been no recombination in the paths of descent from common ancestor to the individual’s parents. Whereas a single SNP can be homozygous without the two alleles being ibd, if many adjacent SNPs are homozygous the most likely explanation is that they are in a block of ibd (Gibson et al., 2006). There can be exceptions, from mutation for example, and several publications give strategies for identifying runs of homozygotes for which ibd may be assumed (e.g., Gazal et al., 2014; Joshi et al., 2015). These strategies include adjusting the size of the blocks, the numbers of heterozygotes or missing values allowed per block, the minor allele frequency, and so on. These software parameters affect the size of the estimates (Meyermans et al., 2020). Some methods (e.g., Gazal et al., 2014; Narasimhan et al., 2016) use hidden Markov models where ibd is the hidden status of an observed homozygote.
Model-based approaches necessarily have assumptions, such as HWE in the sampled population.

    We provide more details elsewhere, but we note here that ROH methods offer a useful alternative to SNP-by-SNP methods even though they cannot completely compensate for lack of information on the ibd reference population. We note also that shorter runs of ibd result from more distant relatedness of an individual’s parents, and ROH procedures can be set to distinguish recent (familial) ibd from distant (evolutionary) ibd. SNP-by-SNP estimators do not make a distinction between these two time scales.

    Species richness and identity both determine the biomass of global reef fish communities

    Reef life survey

    Reef fish communities were censused by a combination of experienced marine scientists and trained recreational SCUBA divers using globally standardized Reef Life Survey methods. All surveys were undertaken on 50 m long transects laid along a contour (at consistent depth) on predominantly hard substrate (usually rocky or coral reef) in shallow waters (depth range of transects 1 to 20 m, average ~7.2 m). Full details of fish census methods, data quality, and training of divers are provided in refs. 22,34,35 and in an online methods manual (www.reeflifesurvey.com). Fish abundance counts and size estimates per 500 m² transect area (2 × 250 m² blocks) were converted to biomass using length–weight relationships for each species obtained from Fishbase (www.fishbase.org). In cases where length–weight relationships were provided in Fishbase using standard length or fork length, rather than total length as estimated by divers, length–length relationships provided in Fishbase allowed conversion to the total length. For improved accuracy in biomass assessments, observed sizes were also adjusted to account for the bias in divers’ perception of fish size underwater using an empirical calibration (ref. 36). Length–weight coefficients from similar-shaped close relatives were used for those species where length–weight relationships were not available in Fishbase. All transects were collapsed into a single average value of biomass for each species at a location to account for any differences in the total number of transect surveys performed.

    Decomposition of difference in ecosystem functioning

    Our equation was inspired by previous decompositions, principally the Price equation originally derived in the field of evolutionary biology as a means of separating genetic and environmental influences on phenotypic change over time (ref. 37).
Fox (ref. 38) and later Fox and Kerr (ref. 12) modified the Price equation to describe how the difference in the ecological function between two communities can be decomposed into components with different ecological interpretations. We follow a similar approach but use a different decomposition where the resulting components are similar to, but not the same as, the components proposed by Fox and Kerr (ref. 12).

    We begin by assuming that the ecological function of the community, such as biomass, is a simple additive function of the contributions of its constituent species. We go on to compare two communities, one of which we consider the “reference” community and the other we refer to as the “comparison” community. The species present in the reference community can be classified into two types: species that are unique to the reference community (i.e., not present in the comparison community) and those that are in common with the comparison community. Let \(s_{uB}\) be the number of unique species in the reference community, and \(s_c\) be the number in common between the two communities. Let \(\bar{z}_{uB}\) be the average ecological function contributed per unique species to the reference community, and \(\bar{z}_{cB}\) be the average ecological function contributed per shared species in the reference community. The total ecological function \(T_B\) of the reference community can thus be decomposed as:
    $$T_B = s_{uB}\bar{z}_{uB} + s_c\bar{z}_{cB}$$
    (1)
    where the first term represents the ecological function contributed by species that are unique to the reference community (i.e., not present in the comparison community) and the latter term represents the contribution from species that are also found in the comparison community.

    Analogously, in the comparison community, the total ecological function can be decomposed as:
    $$T_F = s_{uF}\bar{z}_{uF} + s_c\bar{z}_{cF}$$
    (2)
    with a similar interpretation to Eq. (1). Though there are \(s_c\) species in common between the two communities, the average per species contribution need not be the same in the two communities (i.e., \(\bar{z}_{cB}\) may differ from \(\bar{z}_{cF}\)).

    The species in common between the two communities can serve as a reference point for comparison between communities. It is useful to define \(\delta_B = \bar{z}_{uB} - \bar{z}_{cB}\) and \(\delta_F = \bar{z}_{uF} - \bar{z}_{cF}\) as the difference in average ecological function per species of unique species versus shared species in reference and comparison communities, respectively. From this perspective, we consider the average ecological function of a species unique to the reference community as being equal to the average ecological function of shared species (as measured in the same community) plus the deviation from this value: \(\bar{z}_{uB} = \bar{z}_{cB} + \delta_B\). Using this equality and the analogous one for \(\bar{z}_{uF}\), along with Eqs. (1) and (2), the difference in the ecological function between communities can be decomposed as
    $$\Delta T = T_F - T_B = -s_{uB}\bar{z}_{cB} - s_{uB}\delta_B + s_{uF}\bar{z}_{cF} + s_{uF}\delta_F + s_c\left(\bar{z}_{cF} - \bar{z}_{cB}\right)$$
    (3)
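The five terms of Eq. (3) can be computed directly from two aligned vectors of per-species biomass. The sketch below is an illustration only and is not the authors' linked function: the name `decompose`, the convention that zero biomass means a species is absent, and the toy values are all assumptions.

```python
import numpy as np

def decompose(ref, comp):
    """Sketch of Eq. (3) for a reference and a comparison community.

    ref, comp: 1-D arrays of per-species function (e.g. biomass), aligned by
    species; a value of 0 is taken to mean the species is absent (assumption).
    """
    ref = np.asarray(ref, dtype=float)
    comp = np.asarray(comp, dtype=float)
    in_ref, in_comp = ref > 0, comp > 0
    shared = in_ref & in_comp
    uniq_B = in_ref & ~in_comp            # unique to the reference community
    uniq_F = in_comp & ~in_ref            # unique to the comparison community
    s_c, s_uB, s_uF = shared.sum(), uniq_B.sum(), uniq_F.sum()
    if s_c == 0:
        raise ValueError("the decomposition needs at least one shared species")
    z_cB, z_cF = ref[shared].mean(), comp[shared].mean()
    # With no unique species the deviation is set to 0, so the term vanishes.
    d_B = ref[uniq_B].mean() - z_cB if s_uB else 0.0
    d_F = comp[uniq_F].mean() - z_cF if s_uF else 0.0
    return {
        "RICH_L": -s_uB * z_cB,
        "COMP_L": -s_uB * d_B,
        "RICH_G": s_uF * z_cF,
        "COMP_G": s_uF * d_F,
        "CDE": s_c * (z_cF - z_cB),
    }

# Toy example (hypothetical biomass values for four species).
ref = [10.0, 6.0, 4.0, 0.0]
comp = [0.0, 3.0, 5.0, 2.0]
parts = decompose(ref, comp)
delta_T = sum(comp) - sum(ref)
```

In this toy case the five components are -5, -5, 4, -2 and -2, which sum to the difference in total biomass of -10 between the comparison and reference communities.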
    The first two terms represent the loss in ecological function in the comparison community due to the loss of species that are unique to the reference community. Specifically, the first term represents the loss in ecological function due to the absence of unique species if these species had the same average value of functioning as each of the shared species. In other words, it is the amount by which biomass is expected to decline if species were interchangeable. Therefore, we interpret this term as the “richness loss” or the loss in functioning due strictly to the loss of species: RICH-L \((= -s_{uB}\bar{z}_{cB})\). It will always be negative, assuming there is at least one species unique to the reference community. In cases where \(\bar{z}_{cB} > \bar{z}_{uB}\), it is possible for RICH-L to exceed the total functioning observed at the reference site, which complicates interpretation of the raw values. In this case, it is useful to consider only the relative quantities (each component is scaled by the sum of the absolute values of all components). We note that this situation arises only 41 times out of 2867 comparisons in our analysis, and removing these cases has no effect on our findings. We advise that future applications be aware of this potential issue and test for its influence.

    The second term accounts for the fact that the true loss in ecological function due to these lost species will often differ from the “richness expectation” because the lost species differ in value from the average value of shared species. In other words, this term reflects the deviation in the actual contributions of lost species from the average of shared species, which implies that not all species contribute equally (and that the identities of the species are important in determining differences in biomass between the two communities).
We, therefore, interpret this term as indicating “compositional loss,” or the degree to which loss in biomass is due to loss of particular species: COMP-L \((= -s_{uB}\delta_B)\). If the average lost species provide a higher contribution to the reference community than the average shared species (\(\bar{z}_{uB} > \bar{z}_{cB}\)), the COMP-L term will be negative. On the other hand, if the average lost species represent lower contributions (\(\bar{z}_{uB} < \bar{z}_{cB}\)), the COMP-L term will be positive.

    The next two terms are analogous to the first two terms but instead represent the increase in ecological function in the comparison community due to the “gain” of unique species that are lacking from the reference community. The third term represents the expected increase in ecological function due to an increase in species richness assuming these gained species had the same per species contribution as the shared species: RICH-G \((= +s_{uF}\bar{z}_{cF})\). It is always positive, assuming the comparison community has at least one unique species. The fourth term, COMP-G \((= +s_{uF}\delta_F)\), reflects the difference in composition (with respect to average value) of gained versus shared species. This term can be positive or negative, being positive if the gained species have a higher per species value than the shared species.

    The final term focuses on the changes in biomass considering only the species that are present in both communities. This can be thought of as holding richness and composition constant and considering changes in the community biomass that are controlled extrinsically, i.e., by underlying gradients in resource availability and other environmental factors. Historically, this term has been referred to as the “context-dependent effect,” or CDE, and is the number of shared species \((s_c)\) multiplied by the difference in average biomass of shared species between the two sites \((= s_c(\bar{z}_{cF} - \bar{z}_{cB}))\).
It can be of either sign: positive if shared species have a higher value in the comparison community than in the reference, negative if they have a higher value in the reference community. A very low number of shared species has the potential to bias the CDE term. However, we note that, on average, 49.1 ± 0.003% of species are shared for each comparison at the 100-km scale, and this value is remarkably consistent regardless of spatial scale (51.3–50.0% for 15–50 km).

    Our decomposition is similar to, but not the same as, that of Fox and Kerr12, though both are mathematically sound. Only the CDE term is mathematically identical across the two decompositions and, thus, shares the same interpretation. By extension, the sum across the loss and gain terms (the total diversity effect, or DIV) must also be identical, because both equations partition the same total quantity. Thus, it is important to note that using either decomposition yields the same inference with respect to comparisons of DIV and CDE.

    Our decomposition differs from Fox and Kerr’s because the two approaches use different reference points. We take the perspective that the shared species form the basis for comparison between two communities, so we evaluate the average value of a unique species with respect to its deviation from the average value of a shared species. In contrast, Fox and Kerr effectively evaluate the average value of a unique species with respect to its deviation from the average value of any species in that community (averaging over both unique and shared species). In both decompositions, the “composition” components only exist if there is some difference in the average value of shared and unique species. We prefer our decomposition for this case because it works with that difference directly rather than indirectly via the difference between unique and all species (which is the average of unique and shared species). 
Moreover, our decomposition makes intuitive sense in that the function of the “average” species is determined by the ones that are known to exist at both sites. A full comparison of the Fox and Kerr formulation and ours is provided in the Supplementary Materials.

    Statistical analysis

    A general function to conduct our new decomposition from a site-by-species biomass matrix, and a second function to perform the simulations, can be found here: https://gist.github.com/jslefche/76c076c1c7c5d200e5cb87113cdb9fb4.

    We first ordered all sites by decreasing total biomass. Beginning with the highest biomass site of all sites as the first reference site, we identified all other sites within a certain spatial radius (15-, 25-, 50-, or 100-km) to serve as the comparison sites. Setting the reference to be the site with the highest community biomass constrains the sum of the terms to be negative. This choice simplifies the language used to discuss the output13 and allows us to speak directly to the consequences of real-world activities like overharvesting (and their implications).

    We then computed the components for each set of comparisons. We standardized the output to the same scale (−1, 1) by first taking the sum of the absolute values of all components, and then dividing each component by this value. This relativization was done to account for the fact that raw biomass may differ substantially among sites and regions and to make our results comparable across the entire dataset. Once the scaled components were computed, the reference and comparison sites were removed from the ordered list and excluded from any further comparisons to prevent any bias that might arise from including the same site multiple times. We then moved on to the next most productive site in the list, identified the comparison sites within 100 km, computed the components, and so on, until all sites were analyzed. 
From these individual comparisons, we computed the means of all components while omitting any reference sites for which there were fewer than five comparison sites. As an alternative, we averaged the components for all comparisons for each reference site and then took the grand mean of these averaged values, although this additional level of aggregation did not qualitatively change our results (Supplementary Fig. 6). We have chosen to present the raw values in the main text to demonstrate the full range of variability inherent in the individual comparisons, which might otherwise be condensed by showing only the means for each reference site. We repeated the analysis over multiple spatial radii to assess whether the spatial extent, and therefore the size and composition of the species pool, might influence our results.

    We calculated the relative strength of the total diversity effect vs. the context-dependent effect for each comparison as the ratio of DIV/CDE, and of compositional vs. richness losses as:$$Q=\frac{-s_{uB}\delta_{B}-s_{uB}\bar{z}_{cB}}{-s_{uB}\bar{z}_{cB}}=\frac{\bar{z}_{uB}}{\bar{z}_{cB}}$$
    (4)
    In this case, Q = (COMP-L + RICH-L)/RICH-L, which reduces to the average value of unique species relative to the average value of shared species at the reference site. This quantity reflects the magnitude to which species unique to the reference site contribute to biomass relative to the “expected” contribution per species. To avoid biases associated with averaging ratios, we report the geometric mean of both quantities. Bootstrapped 95% confidence intervals were derived by randomly resampling DIV/CDE and Q for a total of 5000 times. For DIV/CDE, some values were negative, so we excluded them in both the original data and bootstrap samples. As an alternative approach that focused on the magnitude of effect, we examined the ratio of absolute values, |DIV|/|CDE|. 
In this case, the ratio was 6.9x with bootstrap 95% CIs of [6.2, 7.7].

    To explore the drivers of the components of our decomposition, we applied random forest analysis to account for potential collinearity and interactions among the suite of predictors previously selected in ref. 39. Depth was recorded on the surveys while the following predictors were obtained from the combination of remotely sensed and in situ measurements compiled in the Bio-ORACLE database: mean, minimum, maximum, and range of sea surface temperature; mean, minimum and maximum for surface chlorophyll-a; mean salinity; mean PAR; mean dissolved oxygen; mean nitrate concentration; mean phosphate concentration40. Finally, an index of human population density was calculated by fitting a smoothly tapered surface to each settlement point on the year 2010 world-population density grid using a quadratic kernel function described previously41. Random forests were fit using the default settings in the randomForest package42 in R version 4.1.1 (ref. 43). Variable importance was determined using the percent increase in the mean-square error after randomly permuting the predictor of interest for each tree in the random forest, averaging the error of the models, and then computing the difference relative to the accuracy of the original model.

    Null simulations

    A key finding of our analysis is that compositional losses are considerably greater than losses due to other aspects of the reef fish community. We wanted to evaluate whether such a result could be an artifact of applying our decomposition to a dataset in which we assign the site with the higher total biomass as the “reference” community and the site with lower total biomass as the “focal” community. To do so, we conducted simulations in which we created communities with species richness values matching the observed data, but for community compositions that were random. 
Following the same procedure we used with the real communities, we applied our decomposition to these simulated communities to generate null distributions for the average values of each of the five terms when community composition is random. Comparing our observed values to these null distributions tells us if the values of the compositional components (or indeed any component) we observed arose as an artifact of our procedure or, alternatively, because high-biomass sites actually contain more high-biomass species than expected under random community assembly.

    Our simulation procedure focused on the site-by-species biomass matrix from each set of comparisons used in the main 100-km analysis. We divided this matrix by the corresponding site-by-species abundance matrix to yield the observed per capita contribution of each species in each community. We then averaged the per capita contributions of each species across all communities where the species was present to yield a single vector representing mean per capita contributions for all S species within that set of comparisons.

    We initially constructed each simulated community by populating it with every species in the region (“maximum richness”). To determine the biomass of each species in each community we applied the following procedure. First, we identified the minimum and maximum observed abundance of each species across all communities where it is present. For a single community, we sampled an integer value between the minimum and maximum abundance for each species to yield a single vector of random abundance values of length S, and then multiplied this vector by the vector of average per capita contributions. This procedure yielded a new vector representing a new total contribution to biomass by every species. We repeated this for all n communities in the original site-by-species matrix and bound these vectors together in a new “maximum richness” version of the site-by-species matrix. 
For the ith row (community) in the original dataset, we calculated the richness, \(s_{i}\). We then randomly subsampled \(s_{i}\) species from the simulated “maximum richness” site-by-species matrix and set the biomass of any remaining species to zero. We repeated this for each community to yield a simulated “observed richness” site-by-species matrix with the same dimensions as the original matrix. This procedure ensures that richness is held at the observed levels and that the biomass contributions of each species are within the observed range.

    These communities were intentionally constructed randomly with respect to composition, as our goal was to test whether the observed compositional effects in the real data are significantly different from those expected under this null hypothesis. Thus, using the simulated “observed richness” site-by-species matrix, we computed the (scaled) components as we had with the real data and took their means across all communities. We repeated the randomization procedure 1000 times to yield 1000 total average values of each component. We compared the observed mean to the distribution of expected means using a one-tailed t-test to determine whether the observed components were more or less extreme than would be expected by chance.

    Reporting Summary

    Further information on research design is available in the Nature Research Reporting Summary linked to this article.
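
    The five-term decomposition, the scaling step, and the ratio Q described in these Methods can be illustrated with a minimal Python sketch. This is not the authors' function from the gist linked in the Statistical analysis section; the names `decompose`, `scale_components`, and `q_ratio` are hypothetical, absence of a species is assumed to be coded as zero biomass, and \(\delta_{B}\) and \(\delta_{F}\) are taken to be the unique-minus-shared differences in mean biomass, which is what Eq. (4) implies.

```python
import numpy as np


def decompose(ref, comp):
    """Five-term decomposition for one reference/comparison pair.

    ref, comp: per-species biomass vectors in the same species order,
    with 0 meaning the species is absent (an assumption of this sketch).
    """
    ref = np.asarray(ref, dtype=float)
    comp = np.asarray(comp, dtype=float)
    in_ref, in_comp = ref > 0, comp > 0
    shared = in_ref & in_comp      # species present at both sites
    lost = in_ref & ~in_comp       # unique to the reference site
    gained = ~in_ref & in_comp     # unique to the comparison site
    s_c, s_uB, s_uF = int(shared.sum()), int(lost.sum()), int(gained.sum())
    if s_c == 0:
        raise ValueError("decomposition needs at least one shared species")
    z_cB, z_cF = ref[shared].mean(), comp[shared].mean()
    # With no unique species the corresponding terms are zero anyway.
    z_uB = ref[lost].mean() if s_uB else z_cB
    z_uF = comp[gained].mean() if s_uF else z_cF
    return {
        "RICH-L": float(-s_uB * z_cB),
        "COMP-L": float(-s_uB * (z_uB - z_cB)),   # delta_B = z_uB - z_cB
        "RICH-G": float(s_uF * z_cF),
        "COMP-G": float(s_uF * (z_uF - z_cF)),    # delta_F = z_uF - z_cF
        "CDE": float(s_c * (z_cF - z_cB)),
    }


def scale_components(terms):
    """Divide each component by the sum of absolute values of all components."""
    total = sum(abs(v) for v in terms.values())
    if total == 0:
        return dict(terms)
    return {k: v / total for k, v in terms.items()}


def q_ratio(terms):
    """Q = (COMP-L + RICH-L) / RICH-L, as in Eq. (4)."""
    return (terms["COMP-L"] + terms["RICH-L"]) / terms["RICH-L"]


terms = decompose(ref=[10, 6, 4, 0], comp=[0, 3, 5, 2])
scaled = scale_components(terms)
```

    In this toy pair the five raw components sum to the difference in total biomass between the comparison and reference sites, which is a useful check on any implementation of the decomposition.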

    Carbon response of tundra ecosystems to advancing greenup and snowmelt in Alaska

    Study sites: site description and climatic limit

    In this study, we focused on seven flux tower sites located in Alaska, United States, including six AmeriFlux sites and one site of the Korea Polar Research Institute (KOPRI) (Fig. S1A, Table S1). Over the seven study sites, the annual mean temperature was between −10.09 and −0.55 °C and the annual total precipitation ranged from 287 to 540 during 2001–2018 based on the North American Regional Reanalysis (NARR41, 0.3-degree resolution every 3 h). The annual mean temperature increased from 2001 to 2018 at all sites, with rates between 0.5 and 2.2 °C per decade (p  0.05 at all sites). Our study sites were mostly dominated by wet sedges, grasses, moss, lichens, and dwarf shrubs. For example, the dominant plants at the US-Atq site (at a higher latitude) are herbaceous sedges (Carex aquatilis, Eriophorum russeolum, and Eriophorum angustifolium) and shrubs (Salix rotundifolia), with abundant mosses (Calliergon richardsonii and Cinclidium subrotundum) and lichens (Peltigera sp.)11. At the KOPRI site (at a lower latitude), mosses (Sphagnum magellanicum, Sphagnum angustifolium, and Sphagnum fuscum), lichens (Cladonia mitis, Cladonia crispata, and Cladonia stellaris), and tundra tussock cottongrass (Eriophorum vaginatum) are abundant39. The active layer thickness is between 0.33 and 1.0 m, according to field data and radar-based estimates.

    Climatic limits imposed by temperature, water, and radiation were quantified following Nemani et al.42 at each site during the GS between 2001 and 2018 (Fig. S2) using the NARR data. In this study, we defined the GS as from May to Oct., early GS as between May and Jun., peak GS as between Jul. and Aug., and late GS as between Sep. and Oct. For a temperature limit scalar, the monthly mean temperature from −5 to 5 °C was linearly scaled between 100% (i.e., no growth) and 0% (i.e., no reduction in growing days). 
The monthly ratio of precipitation to potential evapotranspiration (PET by the Priestley-Taylor method43), ranging between 0 and 0.75, was linearly scaled from 100 to 0% as a water limit scalar. A radiation limit scalar was estimated as a 0.5% reduction in growing days for every 1% increase in monthly cloudiness above the 10% threshold (monthly cloudiness (n) was estimated44 as \(R = R_{0}(1 - 0.75n^{3.4})\), where \(R\) and \(R_{0}\) are the monthly mean incoming radiation and clear-sky radiation45, respectively).

    The carbon flux response to climatic variations at each site was further analyzed using a forward stepwise multiple regression analysis11 between the NEE and meteorological variables (temperature, PAR, and VPD) using tower data during the GS. Interaction terms among the variables are also included to consider the convolved effects of the variables (Eq. (1)).$$Y_{NEE}=\beta_{0}+\beta_{1}X_{T}+\beta_{2}X_{VPD}+\beta_{3}X_{PAR}+\beta_{4}X_{T}\times X_{VPD}+\beta_{5}X_{T}\times X_{PAR}+\beta_{6}X_{VPD}\times X_{PAR}+\beta_{7}X_{T}\times X_{VPD}\times X_{PAR}$$
    (1)
    where \(Y_{NEE}\) is the daily average NEE (µmol m−2 s−1), and \(X_{T}\), \(X_{VPD}\), and \(X_{PAR}\) are daily average air temperature (°C), VPD (hPa), and PAR (µmol Photon m−2 s−1), respectively. Regression coefficients (\(\beta_{0},\ldots,\beta_{7}\)), standard errors, significance (P-value), and R2 value of the final regression model are summarized in Table S2.

    MODIS: long-term trends of snowmelt and greenup timings

    We collected the gridded MODIS snow cover (MOD10A1.V006, ref. 46; 500-m resolution, daily) and phenology (MCD12Q2.V006, ref. 34; 500-m resolution, yearly) from the NASA Earthdata (https://earthdata.nasa.gov/). We estimated the snowmelt and snowpack timings at each site as the date when a logistic fit to the MODIS snow cover (quality flags of good and best) passed 0.1 each year. We rejected those snowmelt timings when the gaps in the daily MODIS snow cover were longer than 2 weeks around the time of the snowmelt event. The greenup and dormancy timings with a quality flag of best were taken from the MODIS phenology. Based on the spatial representativeness assessment (see Supplementary Note), we decided to use the snowmelt timing and greenup timing within a 1 × 1 pixel window.

    The significance of the long-term trends in greenup and snowmelt timings at each site was determined by Spearman’s rho and Mann-Kendall tests (Fig. S3). We further estimated the 95% confidence intervals of the trends from 3000 timing sets generated by bootstrap resampling from a normal distribution47 (mean equal to each greenup or snowmelt timing with three standard deviations set to 10 or 6.6 days, respectively, i.e., the root-mean-squared values between the ground data-based estimates and MODIS values in a 1 × 1 pixel window, Figs. S9 and S10).

    SGSI: snowmelt-growing season index

    Growing season index (GSI)48 is one of the novel phenology models49 and has been widely applied for the phenological representations of various ecosystems50,51. GSI is a product of three indices of climatic variables (Eq. (2), Fig. 
S5): daylength, VPD, and growing-degree-days (GDD)52. As a phenological measure for a given meteorological condition, we calculated the daily GSI for spring (from Jan. 1 to Jul. 31) and fall (from Aug. 1 to Dec. 31), respectively. For the spring-GSI, GDD is the degree sum when the daily mean temperature rises above −5 °C after Jan. 1. For the fall-GSI, GDD is the degree sum when the daily mean temperature falls below 20 °C after Aug. 1. We then revised the GSI by multiplying it by a snowmelt index (\(iS\)) and referred to this as the snowmelt-GSI (SGSI, Eq. (3), Fig. S5). This guarantees that vegetation greenup does not start unless snow is melted, even if the meteorological conditions are sufficient to trigger leaf-out. The \(iS\) was estimated to be 0 when the snow cover fraction (snowfac variable53 in ED2) was above 0.1 and 1 otherwise.$$\mathrm{GSI}=iX_{1}\times iX_{2}\times iX_{3}$$
    (2)
    $$\mathrm{SGSI}=\mathrm{GSI}\times iS$$
    (3)
    where \(iX\) (with \(X_{1}\), \(X_{2}\), and \(X_{3}\) representing daylength, VPD, and GDD, respectively) is 0 (\(X \le X_{min}\)), 1 (\(X \ge X_{max}\)), and \((X - X_{min})/(X_{max} - X_{min})\) otherwise. \(X_{max}\) and \(X_{min}\) are the maximum and minimum thresholds of each index, respectively. For the spring-GSI, \(X_{min}\) was calculated as the minimum value among the values on the greenup day (from MCD12Q2.V006) for the study period of 2001–2018 at each study site, and \(X_{max}\) was calculated as the minimum value among the values on the maturity day. For the fall-GSI, similarly, \(X_{min}\) was the minimum value for the dormancy timings, and \(X_{max}\) was the minimum value for the senescence timings. We incorporated GSI (or SGSI) into ED2 by multiplying it by the optimal value of leaf biomass on the day, where it operates as an upper limit of the leaf biomass.

    In this study, it was assumed that phenological stages are driven by meteorological conditions, not by other factors (e.g., no assumption of fixed phenological periods6,54,55). The development of a robust phenological model for the tundra ecosystem would be enabled by an increasing amount of ground-based phenology data (e.g., PhenoCam data37).

    Case study: flux data analysis

    There were three sites in Alaska where flux data are available for >5 years: the US-Atq site (flux data during 2004–2008 with delayed snowmelt in 2005), the US-EML site (flux data during 2009–2017 with delayed snowmelt in 2017), and the US-BZF site (flux data during 2012–2018 with delayed snowmelt in 2017 and 2018). We first calculated two timings that are related to the meteorological conditions (0.1-GSI timing and half-max GSI timing51, Fig. S6) using the NARR data. The 0.1-GSI timing and half-max GSI timing were calculated as the day when the GSI passed 0.1 and the half-max value (i.e., 0.5), respectively, each year. To calculate two timings regarding the flux seasonal profile51 (source-sink transition timing and half-max productivity timing, Fig. 
S6), we used 30-min gap-filled FLUXNET2015 (ref. 56) data (NEE and GPP; quality flags of measured or good) at the US-Atq site to calculate daily NEP (i.e., negative NEE) and daily GPP. At the US-EML and the US-BZF sites, we applied an open-source code called ONEFlux (Open Network-Enabled Flux processing pipeline for eddy-covariance data)57 using the ERA5 data (European Centre for Medium-Range Weather Forecasts Reanalysis v5, ref. 58), which were downscaled with a quantile mapping method59. Using the gap-filled 30-min NEP and GPP data from ONEFlux, we calculated the corresponding daily values and fitted a smoothing spline to the daily NEP and the daily GPP each year. The source-sink transition timing was defined as the day when the smoothing spline of the daily NEP passed zero in each year. The half-max productivity timing was set to the day when the smoothing spline of the daily GPP passed the half-max value in that year51.

    Further, we investigated whether the delayed snowmelt altered the relationships between meteorological conditions and the flux-threshold timings at each site based on (1) the correlation between the 0.1-GSI timing and the source-sink transition timing and (2) the correlation between the half-max GSI timing and the half-max productivity timing.

    ED2: model implementation

    We used NARR data41 (0.3-degree resolution every 3 h; temperature, precipitation rate, pressure, v- and u-wind speed, downward longwave and shortwave radiation flux, and relative and specific humidity) for single-point ED2 implementation at each study site from 2001 to 2018. Vegetation structure (LAI, leaf and structural biomass, diameter at breast height, and population density) was initialized for each site by using the maximum annual LAI of cold-adapted shrubs and Arctic C3 grass from the Ent Global Vegetation Structure Dataset v1.0b (Ent-GVSD v1.0b) with the allometric equations in ED2. 
Ent-GVSD v1.0b provides plant functional types (from the MODIS land cover, MCD12C1.V005, ref. 60) and maximum annual LAI values (from the MODIS LAI, MOD15A2.V004, refs. 61,62) in subgrid cover fractions. We did not use canopy heights from Ent-GVSD v1.0b because of the absence of trees at the study sites. Soil texture (the ratio of sand:silt:clay) was set following the Harmonized World Soil Database v1.1 (ref. 63) of the Food and Agriculture Organization of the United Nations (UN FAO). Soil carbon was initialized using the UN FAO Global Soil Organic Carbon Map64, and soil nitrogen was estimated using the soil C/N ratio of moist tundra (mean: 18.4)65.

    The prior distribution of each key variable was based on previous studies (Table S4), and 10,000 parameter sets were randomly generated from the prior distributions (the so-called Monte Carlo method). The best parameter set was selected based on statistical measures (\(r^{2}\) and root-mean-squared error) when compared to the data at the US-Atq site, i.e., NEP flux data for 2004–2006 and MODIS LAI data for 2003–2010 (MCD15A3H.V006, ref. 66; 500-m resolution every 4 days) (Table S5). We then validated the performance of ED2 with this best parameter set by focusing on key ecosystem processes, such as NEP, ecosystem respiration, soil temperature, snowmelt timing, greenup timing, and the LAI at all sites (Table S5). The ED2 LAI was overestimated by 0.15–0.16 compared to the field-measured LAI values (Jul.–Aug. in 2006 at Barrow67 and Jun.–Aug. in 1996 at Toolik68).

    It is worth noting that the accuracy of the MODIS LAI has not been extensively evaluated at high latitudes because of limited ground measurements and few valid MODIS data points due to inadequate sun-sensor geometry, illumination conditions, and cloud contamination69,70. 
Furthermore, the heterogeneous landscapes of the region at the scale of remote sensing data (from hundreds of meters to a few kilometers) are also a major challenge that must be addressed before the data can be evaluated against ground measurements. According to the spatial representativeness assessment (see Supplementary Note, and Figs. S7 and S8), the landscapes around the flux towers generally have heterogeneity at a level similar to, or smaller than, the tower footprint size (200–300 m) during the early GS and peak GS in the MODIS 1 × 1 pixel window (i.e., 500 × 500 m²), but mostly higher than in the 3 × 3 pixel window (Table S6). This indicates that it is desirable to evaluate the MODIS 1 × 1 pixel LAI values against ground measurements, as both MODIS greenup and snowmelt timings agreed more with the ground data at the 1 × 1 pixel window scale than at the 3 × 3 pixel window scale (Figs. S9 and S10). A more thorough evaluation of both MODIS LAI data and ED2 LAI values is required in the coming years with the increase in ground data availability (e.g., National Ecological Observatory Network, NEON, LAI measurements).

    Correlation analysis: the effects of early or delayed snowmelt timing

    To analyze the net and lagged effects of early or late snowmelt timing, it is necessary to constrain the contribution of interannual meteorological variation. Therefore, we compared only the years when meteorological conditions were similar, i.e., when the weekly mean GSI value was within one standard deviation of the weekly mean GSI during 2001–2018 (at the US-Hva site, weekly values during 1994–2018); meteorological conditions appeared similar for 8 or 9 years at each study site, except the US-BZF site, where the similarities were found for 10 years. We also limited the effect of greenup timing changes by excluding the years when greenup was earlier or later by one standard deviation of the mean of greenup timings during the study period. 
For the years satisfying the constraints, a least-squares linear regression was applied between the snowmelt timing change (deviation in snowmelt timing each year from the mean snowmelt timing) and the seasonal deviation (the difference of the seasonal mean from the mean value over the years) of each process from the ED2 results.

    To analyze the net and lagged effects of delayed snowmelt, we implemented the ED2 model in two schemes: (1) following the meteorologically-determined phenological index (i.e., the GSI, Eq. (2)) and (2) constraining leaf-out by snowmelt (i.e., the SGSI, Eq. (3)). For the years when unmelted snow delayed greenup, we took the difference between the modeled results (i.e., GSI and SGSI) for each process and applied a least-squares linear regression between the difference of each process and the delayed snowmelt days.
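
    The two phenological indices behind these schemes (Eqs. (2) and (3)) and the threshold-crossing timings used in the flux case study can be sketched in Python as follows. This is an illustrative sketch and not the ED2 implementation: the names `ramp`, `gsi_and_sgsi`, and `first_crossing` and all threshold values are hypothetical, and the same increasing ramp is applied to all three drivers because that is how the indices are defined in the text.

```python
import numpy as np


def ramp(x, x_min, x_max):
    """Index that is 0 at or below x_min, 1 at or above x_max, linear between."""
    if x_max <= x_min:
        raise ValueError("x_max must be greater than x_min")
    x = np.asarray(x, dtype=float)
    return np.clip((x - x_min) / (x_max - x_min), 0.0, 1.0)


def gsi_and_sgsi(daylength, vpd, gdd, snow_fraction, thresholds):
    """Daily GSI (Eq. 2) and snowmelt-constrained SGSI (Eq. 3).

    thresholds maps each driver name to a (x_min, x_max) pair; the values
    are site-specific in the study and purely illustrative here.
    """
    gsi = (
        ramp(daylength, *thresholds["daylength"])
        * ramp(vpd, *thresholds["vpd"])
        * ramp(gdd, *thresholds["gdd"])
    )
    # Snowmelt index: 0 while the snow cover fraction is above 0.1, else 1.
    i_s = np.where(np.asarray(snow_fraction, dtype=float) > 0.1, 0.0, 1.0)
    return gsi, gsi * i_s


def first_crossing(series, level):
    """Index of the first day on which the series reaches the level, or None."""
    hits = np.flatnonzero(np.asarray(series, dtype=float) >= level)
    return int(hits[0]) if hits.size else None


thresholds = {"daylength": (8.0, 16.0), "vpd": (0.0, 100.0), "gdd": (0.0, 100.0)}
gsi, sgsi = gsi_and_sgsi(
    daylength=[8.0, 12.0, 16.0],
    vpd=[100.0, 100.0, 100.0],
    gdd=[0.0, 50.0, 100.0],
    snow_fraction=[0.9, 0.5, 0.0],
    thresholds=thresholds,
)
delay_days = first_crossing(sgsi, 0.1) - first_crossing(gsi, 0.1)
```

    In this toy series the meteorological index alone passes 0.1 one day before the snow-constrained index does, which is the kind of delay that the regression against delayed snowmelt days is meant to capture.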

    Parasitoid vectors a plant pathogen, potentially diminishing the benefits it confers as a biological control agent

    Insect rearing

    A CLas-negative colony of ACP was initially collected from CLas-free Murraya exotica L. growing in the ornamental landscape of South China Agricultural University (SCAU, Guangzhou, China) in May 2014. It was then reared on potted M. exotica in a greenhouse at SCAU. M. exotica plants were pruned regularly to promote the growth flushes necessary to stimulate ACP oviposition. The ACP populations were periodically (at least once a month) tested to ensure the colony was CLas-free using nested quantitative PCR detection according to the method described by Coy et al.30.

    The parasitoid T. radiata used in the current study was initially collected from ACP hosts on M. exotica plants in the above-mentioned location during June 2015. Its population was maintained in rearing cages (60 × 60 × 60 cm) using a CLas-free ACP-M. exotica rearing system under laboratory conditions (26 ± 1 °C, RH 80 ± 10% with L:D = 14:10 photoperiods in insect incubators).

    Host plants

    CLas-free and CLas-infected plants of Citrus reticulata Blanco cv. Shatangju were used in the current study. Both plant types were obtained from The Citrus Research Institute of Zhaoqing University (Guangdong, China). The CLas-infected plants were inoculated by shoot grafting. All plants were approximately 4 years old and 1.2−1.5 m in height, separated in nylon net greenhouses (70 mesh per inch²) at two different locations about 2.2 km apart in SCAU. Again, nested qPCR detection was performed periodically (at least once a month) to test for the presence or absence of CLas in the citrus plants according to the method described by Coy et al.30.

    Acquisition and persistence of CLas in Tamarixia radiata
    When new shoots of CLas-infected C. reticulata plants had grown to 5–8 cm, 20 pairs of 1-week-old ACP adults were introduced into one nylon bag covering one fresh shoot to lay eggs for 48 h. When the progeny of ACP developed through to 4th or 5th instar nymphs (CLas-donor ACP), which are the stages preferred by T. radiata parasitoids, 150 of the ACP nymphs were randomly selected and the remaining ones were removed. Following this, 10 pairs of 3-day-old T. radiata adults, randomly selected from the population that had been tested to be CLas-free, were introduced into the nylon bag in order to parasitize the 4th or 5th instar ACP nymphs for 48 h before being recaptured. Then the potentially parasitized ACP nymphs together with the citrus plants were cultured in a plant growth chamber (Jiangnan Instrument Company, RXZ-500D, at 26 ± 1 °C, 60 ± 2% RH and 14:10 h L:D photoperiod of 3,000 lx illumination).

    When the progeny of T. radiata (considered F0 generation) developed to 3-day egg, 1st to 4th instar larvae, pupae, and adult stages respectively, they were identified and collected with the assistance of a stereomicroscope. DNA of each stage sample was extracted using the TIANamp Genomic DNA Kit (TIANGEN, Beijing, China) for CLas qPCR detection and titer quantification. Thirty eggs, 20 individuals of 1st or 2nd instar, 10 individuals of 3rd or 4th instar larvae or pupa, as well as three individuals of female or male adults were subsequently ground together to represent each life stage in qPCR, and each stage qPCR detection was repeated three times.

    The primers used for CLas qPCR detection were LJ900 primers (F5′-GCCGTTTTAACACAAAAGATGAATATC-3′, R5′-ATAAATCAATTTGTTCTAGTTTACGAC-3′), and the 18S rRNA gene of T. radiata (F5′-AAACGGCTACCACATCCA-3′, R5′-ACCAGACTTGCCCTCCA-3′)31 was used as an internal control for DNA normalization and quantification. 
In order to normalize the qPCR values, each qPCR reaction was performed in three independent runs using SYBR Premix Ex Taq (Takara, Dalian, China) in Bio-Rad CFX Connect™ Real-Time PCR Detection System, with a protocol of initial denaturation at 95 °C for 3 min, followed by 40 cycles at 95 °C for 10 s, 60 °C for 20 s and 72 °C for 30 s.

    To monitor the CLas persistence in T. radiata, newly emerged female adults of T. radiata (considered F1 generation) were collected from the above experiment and fed with 20% honey water. After 1, 5, 10, and 15 days, 10 parasitoids were recaptured, subsequently ground for DNA extraction and CLas titer detection and quantification using qPCR. The protocol of DNA extraction and qPCR reaction was the same as above, and qPCR quantification was repeated three times for each treatment.

    Localization patterns of CLas in Tamarixia radiata
    Localization patterns of CLas in different instars of T. radiata

    Fluorescent in situ hybridization (FISH) was used to visualize the distribution of CLas in T. radiata exposed to CLas-positive ACP, following the method of Gottlieb et al.32 with a slight modification. Eggs and different larval instars of T. radiata were collected and fixed in Carnoy’s solution (chloroform-ethanol-glacial acetic acid [6:3:1, vol/vol] formamide) overnight at 4 °C. After fixation, the samples were washed three times in 50% ethanol with 1 × phosphate buffered saline (PBS) for 5 min. Then the samples were decolorized in 6% H2O2 in ethanol for 12 h, after which they were hybridized overnight in 1 ml hybridization buffer (20 mM Tris-HCl pH 8.0, 0.9 M NaCl, 0.01% sodium dodecyl sulfate, 30% formamide) containing 10 pmol of fluorescent probes/ml in a 37 °C water bath under dark conditions. The CLas probe used for FISH was 5′-Cy3-GCCTCGCGACTTCGCAACCCAT-3′. Finally, the stained T. radiata samples were washed three times in a washing buffer (0.3 M NaCl, 0.03 M sodium citrate, 0.01% sodium dodecyl sulfate, 10 min per wash). After the samples were whole mounted and stained, the slides were observed and photographed using a Nikon eclipse Ti-U inverted microscope. For each stage sample, approximately 20 individuals were examined to confirm the results.

    Localization patterns of CLas in different organs of T. radiata

    Different organs (gut, fat body, ovary, poison sac, salivary glands, spermatheca, and chest muscle) were dissected from newly emerged adults of T. radiata in 1 × PBS under a stereomicroscope using a depression microscope slide and a fine anatomical needle. After a sufficient number of each tissue sample was collected (20 or more), the tissues were washed three times with 1 × PBS, followed by the fixation, decolorization, and hybridization procedures as outlined above, except that the decolorization time was 2 h. 
After hybridization, nuclei in the different organs were counterstained with DAPI (0.1 mg/ml in 1× PBS) for 10 min, then the samples were transferred to slides, mounted whole in hybridization buffer, and viewed using confocal microscopy (Nikon, Japan).

Maternal transmission of CLas between Tamarixia generations

Five groups of experiments were used to clarify whether CLas can be transmitted vertically between different T. radiata generations. In the first group, 60 pairs of newly emerged T. radiata adults from the CLas-infected ACP colony (potential CLas-acquired parasitoid adults, F0 generation) were introduced into 60 nylon bags (one female per bag). Each bag covered one fresh citrus plant shoot with one marked CLas-free 4th instar nymph of ACP. The parasitoid females were given 24 h to oviposit and were then transferred successively to the other four groups to oviposit at intervals of 24 h before they were recaptured for CLas PCR detection (58/60 females and 56/60 males of T. radiata were CLas-infected). Only the progeny (F1 generation) whose parasitoid parents were both CLas-infected continued to be investigated.

When the F1 progeny of CLas-infected parasitoid females developed to the egg, larval, pupal, and adult stages, they were collected and divided into two groups: one group was used for qPCR detection of the CLas titer, and the other for FISH visualization of CLas. The qPCR and FISH protocols and the number of tested individuals were the same as previously outlined. Each stage was repeated three times.
    CLas detection in T. radiata-inoculated ACP

Quantitative PCR detection of CLas

Approximately 60 newly emerged parasitoid adult females from CLas-infected ACP hosts (potential CLas-acquired parasitoid adults) were collected using an aspirator. They were first starved for 5 h, then released into finger tubes (diameter 6 mm × length 30 mm), with one female per tube containing one 4th instar nymph of CLas-free ACP (this was treated as one experimental replicate). The probing behavior of the parasitoids was observed under a stereomicroscope, after which the parasitoids were recaptured for CLas PCR detection (as in the above experiment, approximately 95% were CLas-infected). Only those 4th instar ACP nymphs that were probed for egg-laying by a CLas-infected parasitoid but survived the probing (the proportion of such nymphs averaged 5.36 ± 0.47%, and all of them were CLas-infected) were transferred onto fresh CLas-free M. exotica shoots to complete their development (hereafter referred to as “T. radiata-inoculated ACP”). The experiment was repeated in 32 parallel replicates (Supplementary Table 1), from which 103 T. radiata-inoculated ACP nymphs were finally obtained.

Following the above, thirty T. radiata-inoculated ACP nymphs were collected when they developed into 5th instar nymphs (the stage when proliferation of the infection might have just begun, since the infection was introduced at the 4th instar). In addition, thirty 8-day-old adults that developed from the T. radiata-inoculated ACP nymphs were also collected, because the results in Wu et al.28 revealed that the proportion of CLas-infected ACP individuals exceeds 90% at the 12th day after infection acquisition, while ACP takes 4 days to develop into an adult from the 5th instar stage. Their alimentary canals and salivary glands were dissected under a stereomicroscope using the methods of Ammar et al.33, and hemolymph was collected with a 10 μl pipette tip using the method of Killiny et al.34.
The DNA of the alimentary canals, salivary glands and hemolymph was extracted using the TIANamp Micro DNA Kit (Tiangen, Beijing, China), and the relative titers of CLas in each tissue of ACP nymphs and adults were detected by qPCR with the LJ900 primers. The β-actin gene of ACP (F 5′-CCCTGGACTTTGAACAGGAA-3′; R 5′-CTCGTGGATACCGCAAGATT-3′) was selected as an internal control for data normalization and quantification35. For each sample, qPCR detection was repeated three times.

FISH visualization of CLas

The alimentary canals and salivary glands of 5th instar nymphs and 8-day-old adults of T. radiata-inoculated ACP were dissected as described above, and the distribution of CLas was visualized by FISH and confocal microscopy. The alimentary canals and salivary glands of CLas-infected ACP nymphs and adults (collected from CLas-infected citrus plants) were used as a positive control, and five to ten samples were examined by FISH for each tissue.
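The normalization of CLas qPCR signal to an internal control gene described above is commonly computed with the 2^(−ΔΔCt) formulation. The following is a minimal sketch of that arithmetic, not the authors' analysis script: the function name `relative_titer`, its argument names, the choice of calibrator sample, and all Ct values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def relative_titer(target_ct, reference_ct, calib_target_ct, calib_reference_ct):
    """Return the 2^(-ddCt) relative titer of one sample against a calibrator.

    Each argument is a list of Ct values from repeated qPCR runs.
    """
    # dCt: target (CLas) Ct minus internal-control Ct, using the mean of the repeats
    d_ct_sample = mean(target_ct) - mean(reference_ct)
    d_ct_calibrator = mean(calib_target_ct) - mean(calib_reference_ct)
    # ddCt: dCt of the sample relative to dCt of the calibrator
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2 ** (-dd_ct)


# Hypothetical Ct values (three repeats each); these are not data from the study.
salivary_gland = relative_titer(
    target_ct=[23.5, 24.0, 24.5],           # CLas in the tissue of interest
    reference_ct=[17.75, 18.0, 18.25],      # internal control in the same tissue
    calib_target_ct=[26.5, 27.0, 27.5],     # CLas in the calibrator sample
    calib_reference_ct=[18.5, 19.0, 19.5],  # internal control in the calibrator
)
print(salivary_gland)
```

With these invented values the sample ΔCt is 6 and the calibrator ΔCt is 8, so ΔΔCt is −2 and the relative titer is four times that of the calibrator. A sample identical to the calibrator gives a relative titer of 1.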
    CLas transmission from T. radiata-inoculated ACP to citrus plants

According to the above experimental results, if CLas could be detected in the salivary glands of the 8-day-old ACP adults (T. radiata-inoculated ACP), 30 more of these adults were randomly selected and placed on fresh shoots of CLas-free citrus for inoculation. ACP adults that acquired CLas from plants and CLas-free ACP adults were used as positive and negative controls, respectively.

After 20, 30, 40, and 50 days of feeding, samples (1 cm²) were cut from the citrus leaves fed on by T. radiata-inoculated ACP (named CLas-recipient citrus leaves), by ACP that acquired CLas from plants (positive control), and by CLas-free ACP (negative control). Their DNA was extracted using the DNAsecure Plant Kit (Tiangen, Beijing, China). CLas infection in these plants was detected by nested PCR based on the methods of Jagoueix et al.36 and Deng et al.37. The experiment was repeated in six plants for each of the 20-, 30-, 40-, and 50-day feeding durations, and the infection rates of CLas were calculated.

Localization of CLas in citrus plants fed on by T. radiata-inoculated ACP

In order to further confirm the infection of CLas in the recipient citrus leaves, FISH was used to visualize the localization of CLas. According to the above experimental results, after being fed on for 50 days by the T. radiata-inoculated ACP adults, citrus leaf sections containing the midrib were cross-sliced into 30 µm sections using a cryostat (CM1950, Leica, Germany). The leaf samples were prepared for FISH visualization according to the protocol described by Gottlieb et al.32. Citrus leaves from plants that had been fed on by ACP adults that acquired CLas from plants and by CLas-free ACP adults were used as positive and negative controls, respectively.
Five to 10 leaf samples were examined by FISH for each treatment.

Phylogenetic analysis of CLas bacteria in different ACP populations and citrus plants

To assess the identity of the CLas bacteria in CLas donor ACP, CLas vectored parasitoids, T. radiata-inoculated ACP and recipient citrus leaves, the outer membrane protein gene (omp) of CLas was PCR amplified with the primers HP1asinv (5′-GATGATAGGTGCATAAAAGTACAGAAG-3′) and Lp1c (5′-AATACCCTTATGGGATACAAAAA-3′) following the procedure described in Bastianel et al.38. The PCR products were then sent for sequencing after the expected bands were visualized on 1% agarose gels.

All the DNA sequences of the CLas omp gene were edited and aligned manually using Clustal X 1.83 (ref. 39) in MEGA 6 (ref. 40). The best model and partitioning scheme were chosen using the Bayesian information criterion in PartitionFinder v.1.0.1 (ref. 41). Phylogenetic analysis was undertaken using a maximum likelihood (ML) method with 1000 non-parametric bootstrap replications in RAxML (ref. 42). Escherichia coli was used as an outgroup.

Statistics and reproducibility

Taking the 18S rRNA gene of T. radiata and the β-actin gene of ACP as housekeeping genes, the relative titers of CLas in different stages and different tissues of T. radiata and ACP were calculated using the \(2^{-\Delta\Delta C_{t}}\) method (ref. 43). For the parallel experiments that had more than three replicates, differences were compared using analysis of variance (ANOVA) in SPSS 18.0 at a significance level of α = 0.05, while the two-sample comparison of CLas titer between sexes of Tamarixia adults was performed using a paired t-test. Fluorescent pictures were processed using Photoshop CS5 software.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.
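The paired t-test named under Statistics and reproducibility was run in SPSS. As a rough illustration of what that test computes, the sketch below derives the t statistic from within-pair differences using only the Python standard library. The function name `paired_t_statistic`, the way samples are paired, and all titer values are assumptions made for this example and are not taken from the study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for a paired t-test on two equal-length samples."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two samples of equal length with at least two pairs")
    # The test works on the within-pair differences
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # t = mean difference divided by its standard error
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical relative titers for paired female and male samples; not study data.
females = [2.0, 3.0, 5.0, 6.0]
males = [1.0, 1.0, 2.0, 2.0]
t_value, df = paired_t_statistic(females, males)

# Two-tailed critical value of Student's t for df = 3 at alpha = 0.05
T_CRITICAL_DF3 = 3.182
print(t_value, df, abs(t_value) > T_CRITICAL_DF3)
```

With these invented numbers the differences are 1, 2, 3 and 4, which gives t = √15 ≈ 3.87 on 3 degrees of freedom, above the 0.05 two-tailed critical value of 3.182. A statistics package would report the exact p-value rather than a comparison against a tabulated threshold.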

  • in

    Skin irritation and potential antioxidant, anti-collagenase, and anti-elastase activities of edible insect extracts

    Insect extracts

Thai edible insects (Fig. 1) were extracted, and the yield of each extract is shown in Fig. 2. Hexane extracts of most insects, except for P. succincta, provided the highest yield, followed by ethanolic extracts and aqueous extracts, respectively. The reason might be the high fat content of insects. Since these fat components are hydrophobic, they could be extracted well using a nonpolar solvent, e.g. hexane. A semi-polar solvent like ethanol could also be used to extract hydrophobic compounds, but with lower extraction efficacy5. Several previous studies reported that fat is abundant in insect biomass, ranging from 4.2 to 77.2% and averaging about 26.8% of dried insects6,7.

Figure 1. External appearances of Thai edible insects, including (a) rice grasshopper (Euconocephalus sp.), (b) bamboo caterpillar (O. fuscidentalis), (c) house cricket (A. domesticus), (d) silkworm pupae (B. mori), (e) Bombay locust (P. succincta), and (f) giant water bug (L. indicus).

Figure 2. Yields of insect extracts, including B. mori (BM), O. fuscidentalis (OF), Euconocephalus sp. (EU), P. succincta (PS), A. domesticus (AD), and L. indicus (LI). The data are expressed as mean ± SD (n = 3). The Greek letters (α, β, γ, and δ) indicate significant differences among hexane extracts, the capital letters (A, B, C, and D) indicate significant differences among ethanolic extracts, and the lowercase letters (a, b, and c) indicate significant differences among aqueous extracts. The data were analyzed using one-way ANOVA followed by a post hoc Tukey test (p

  • in

    Wolbachia reduces virus infection in a natural population of Drosophila


  • in

    Increased microbial expression of organic nitrogen cycling genes in long-term warmed grassland soils

    1.Schmidt MWI, Torn MS, Abiven S, Dittmar T, Guggenberger G, Janssens IA, et al. Persistence of soil organic matter as an ecosystem property. Nature. 2011;478:49–56.CAS 
    PubMed 

    Google Scholar 
    2.Bond-Lamberty B, Bailey VL, Chen M, Gough CM, Vargas R. Globally rising soil heterotrophic respiration over recent decades. Nature. 2018;560:80–3.CAS 
    PubMed 

    Google Scholar 
    3.Bradford MA. Thermal adaptation of decomposer communities in warming soils. Front Microbiol. 2013;4:1–16.
    Google Scholar 
    4.Cavicchioli R, Ripple WJ, Timmis KN, Azam F, Bakken LR, Baylis M, et al. Scientists’ warning to humanity: microorganisms and climate change. Nat Rev Microbiol. 2019;17:569–86.CAS 
    PubMed 
    PubMed Central 

    Google Scholar 
    5.Jansson JK, Hofmockel KS. Soil microbiomes and climate change. Nat Rev Microbiol. 2020;18:35–46.CAS 
    PubMed 

    Google Scholar 
    6.Liu L, Greaver TL. A global perspective on belowground carbon dynamics under nitrogen enrichment. Ecol Lett. 2010;13:819–28.PubMed 

    Google Scholar 
    7.Knicker H. Soil organic N – An under-rated player for C sequestration in soils? Soil Biol Biochem. 2011;43:1118–29.CAS 

    Google Scholar 
    8.Soong JL, Fuchslueger L, Marañon-Jimenez S, Torn MS, Janssens IA, Peñuelas J, et al. Microbial carbon limitation: The need for integrating microorganisms into our understanding of ecosystem carbon cycling. Glob Chang Biol. 2020;26:1953–61.
    Google Scholar 
    9.Mooshammer M, Wanek W, Hämmerle I, Fuchslueger L, Hofhansl F, Knoltsch A, et al. Adjustment of microbial nitrogen use efficiency to carbon:nitrogen imbalances regulates soil nitrogen cycling. Nat Commun. 2014;5:1–7.
    Google Scholar 
    10.Geisseler D, Horwath WR, Joergensen RG, Ludwig B. Pathways of nitrogen utilization by soil microorganisms – a review. Soil Biol Biochem. 2010;42:2058–67.CAS 

    Google Scholar 
    11.Wang X, Wang C, Cotrufo MF, Sun L, Jiang P, Liu Z, et al. Elevated temperature increases the accumulation of microbial necromass nitrogen in soil via increasing microbial turnover. Glob Chang Biol. 2020;26:5277–89.PubMed 

    Google Scholar 
    12.Simpson AJ, Simpson MJ, Smith E, Kelleher BP. Microbially derived inputs to soil organic matter: Are current estimates too low? Environ Sci Technol. 2007;41:8070–6.CAS 
    PubMed 

    Google Scholar 
    13.Kuypers MMM, Marchant HK, Kartal B. The microbial nitrogen-cycling network. Nat Rev Microbiol. 2018;16:263–76.CAS 
    PubMed 

    Google Scholar 
    14.Walker TWN, Kaiser C, Strasser F, Herbold CW, Leblans NIW, Woebken D, et al. Microbial temperature sensitivity and biomass change explain soil carbon loss with warming. Nat Climate Change. 2018;8:885–9.CAS 

    Google Scholar 
    15.Marañón-Jiménez S, Peñuelas J, Richter A, Sigurdsson BD, Fuchslueger L, Leblans NIW, et al. Coupled carbon and nitrogen losses in response to seven years of chronic warming in subarctic soils. Soil Biol Biochem. 2019;134:152–61.
    Google Scholar 
    16.Nguyen TTH, Myrold DD, Mueller RS. Distributions of extracellular peptidases across prokaryotic genomes reflect phylogeny and habitat. Front Microbiol. 2019;10:1–14.
    Google Scholar 
    17.Zimmerman AE, Martiny AC, Allison SD. Microdiversity of extracellular enzyme genes among sequenced prokaryotic genomes. ISME J. 2013;7:1187–99.CAS 
    PubMed 
    PubMed Central 

    Google Scholar 
    18.Beier S, Bertilsson S. Bacterial chitin degradation-mechanisms and ecophysiological strategies. Front Microbiol. 2013;4:1–12.
    Google Scholar 
    19.Kielak AM, Cretoiu MS, Semenov AV, Sørensen SJ, Van, Elsas JD. Bacterial chitinolytic communities respond to chitin and pH alteration in soil. Appl Environ Microbiol. 2013;79:263–72.CAS 
    PubMed 
    PubMed Central 

    Google Scholar 
    20.Weintraub MN, Schimel JP. Seasonal protein dynamics in Alaskan arctic tundra soils. Soil Biol Biochem. 2005;37:1469–75.CAS 

    Google Scholar 
    21.Boer VM, De Winde JH, Pronk JT, Piper MDW. The genome-wide transcriptional responses of Saccharomyces cerevisiae grown on glucose in aerobic chemostat cultures limited for carbon, nitrogen, phosphorus, or sulfur. J Biol Chem. 2003;278:3265–74.CAS 
    PubMed 

    Google Scholar 
    22.Kolkman A, Daran-Lapujade P, Fullaondo A, Olsthoorn MMA, Pronk JT, Slijper M, et al. Proteome analysis of yeast response to various nutrient limitations. Mol Syst Biol. 2006;2:1–16.
    Google Scholar 
    23.Silberbach M, Hüser A, Kalinowski J, Pühler A, Walter B, Krämer R, et al. DNA microarray analysis of the nitrogen starvation response of Corynebacterium glutamicum. J Biotechnol. 2005;119:357–67.CAS 
    PubMed 

    Google Scholar 
    24.Merrick MJ, Edwards RA. Nitrogen control in bacteria. Microbiol Rev. 1995;59:604–22.CAS 
    PubMed 
    PubMed Central 

    Google Scholar 
    25.Daebeler A, Abell GCJ, Bodelier PLE, Bodrossy L, Frampton DMF, Hefting MM, et al. Archaeal dominated ammonia-oxidizing communities in Icelandic grassland soils are moderately affected by long-term N fertilization and geothermal heating. Front Microbiol. 2012;3:1–14.
    Google Scholar 
    26.Yeager CM, Kornosky JL, Housman DC, Grote EE, Belnap J, Kuske CR. Diazotrophic community structure and function in two successional stages of biological soil crusts from the colorado plateau and Chihuahuan Desert. Appl Environ Microbiol. 2004;70:973–83.CAS 
    PubMed 
    PubMed Central 

    Google Scholar 
    27.Malik AA, Swenson T, Weihe C, Morrison EW, Martiny JBH, Brodie EL, et al. Drought and plant litter chemistry alter microbial gene expression and metabolite production. ISME J. 2020;14:2236–47.CAS 
    PubMed 
    PubMed Central 

    Google Scholar 
    28.Tveit A, Schwacke R, Svenning MM, Urich T. Organic carbon transformations in high-Arctic peat soils: Key functions and microorganisms. ISME J. 2013;7:299–311.CAS 
    PubMed 

    Google Scholar 
    29.Geisen S, Tveit AT, Clark IM, Richter A, Svenning MM, Bonkowski M, et al. Metatranscriptomic census of active protists in soils. ISME J. 2015;9:2178–90.CAS 
    PubMed 
    PubMed Central 

    Google Scholar 
    30.Urich T, Lanzén A, Qi J, Huson DH, Schleper C, Schuster SC. Simultaneous assessment of soil microbial community structure and function through analysis of the meta-transcriptome. PLoS ONE. 2008;3:1–13.
    Google Scholar 
    31.Kallenbach CM, Frey SD, Grandy AS. Direct evidence for microbial-derived soil organic matter formation and its ecophysiological controls. Nat Commun. 2016;7:1–10.
    Google Scholar 
    32.Walker TWN, Janssens IA, Weedon JT, Sigurdsson BD, Richter A, Peñuelas J, et al. A systemic overreaction to years versus decades of warming in a subarctic grassland ecosystem. Nat Ecol Evol. 2020;4:101–8.PubMed 

    Google Scholar 
    33.Sigurdsson BD, Wallander H, Gunnarsdóttir GE, Richter A, Sigurðsson P, Leblans NIW, et al. Geothermal ecosystems as natural climate change experiments: the ForHot research site in Iceland as a case study. Icelandic Agric Sci. 2016;29:53–71.
    Google Scholar 
    34.Söllinger A, Séneca J, Dahl MB, Prommer J, Verbruggen E, Sigurdsson BD, et al. Downregulation of the microbial protein biosynthesis machinery in response to weeks, years and decades of soil warming. 2021 Research Square preprint. https://doi.org/10.21203/rs.3.rs-132190/v235.Leblans N. Natural gradients in temperature and nitrogen: Iceland represents a unique environment to clarify long-term global change effects on carbon dynamics. Joint doctoral dissertation. Antwerp University and Agricultural University of Iceland, Reykjavik, Iceland; 2016:1–229.36.Angel R, Claus P, Conrad R. Methanogenic archaea are globally ubiquitous in aerated soils and become active under wet anoxic conditions. ISME J. 2012;6:847–62.CAS 
    PubMed 

    Google Scholar 
    37.Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:5–11.
    Google Scholar 
    38.Gillespie CS. Fitting heavy tailed distributions: the poweRlaw Package. J Stat Softw. 2015;64:1–16.
    Google Scholar 
    39.El-Gebali S, Mistry J, Bateman A, Eddy SR, Luciani A, Potter SC, et al. The Pfam protein families database in 2019. Nucleic Acids Res. 2018;47:427–32.
    Google Scholar 
    40.Eddy SR. Accelerated profile HMM searches. PLOS Comput Biol. 2011;7:e1002195.CAS 
    PubMed 
    PubMed Central 

    Google Scholar 
    41.Petersen TN, Brunak S, von Heijne G, Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods. 2011;8:785–6.CAS 
    PubMed 

    Google Scholar 
    42.Bendtsen JD, Kiemer L, Fausbøll A, Brunak S. Non-classical protein secretion in bacteria. BMC Microbiol. 2005;5:1–13.
    Google Scholar 
    43.Yu NY, Wagner JR, Laird MR, Melli G, Rey S, Lo R, et al. PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes. Bioinformatics. 2010;26:1608–15.CAS 
    PubMed 
    PubMed Central 

    Google Scholar 
    44.Orsi WD. MetaProt: an integrated database of predicted proteins for improved annotation of metaomic datasets. Open Data LMU. 2020. https://doi.org/10.5282/ubm/data.18345.Lombard V, Golaconda Ramulu H, Drula E, Coutinho PM, Henrissat B. The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res. 2013;42:490–5.
    Google Scholar 
    46.Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using DIAMOND. Nat Methods. 2014;12:59–60.PubMed 

    Google Scholar 
    47.Oksanen AJ, Blanchet FG, Kindt R, Legen- P, Minchin PR, Hara RBO, et al. vegan: Community Ecology Package. 2019. https://cran.r-project.org/package=vegan48.Lê S, Josse J, Husson F. FactoMineR: an R package for multivariate analysis. J Stat Softw. 2008;25:1–18.
    Google Scholar 
    49.Kolde R. pheatmap: pretty heatmaps. 2019. https://cran.r-project.org/package=pheatmap50.Noll L, Zhang S, Zheng Q, Hu Y, Wanek W. Wide-spread limitation of soil organic nitrogen transformations by substrate availability and not by extracellular enzyme content. Soil Biol Biochem. 2019;133:37–49.CAS 
    PubMed 
    PubMed Central 

    Google Scholar 
    51. Schimel JP, Bennett J. Nitrogen mineralization: challenges of a changing paradigm. Ecology. 2004;85:591–602.
    52. Wild B, Ambus P, Reinsch S, Richter A. Resistance of soil protein depolymerization rates to eight years of elevated CO2, warming, and summer drought in a temperate heathland. Biogeochemistry. 2018;140:255–67.
    53. Wanek W, Mooshammer M, Blöchl A, Hanreich A, Richter A. Determination of gross rates of amino acid production and immobilization in decomposing leaf litter by a novel 15N isotope pool dilution technique. Soil Biol Biochem. 2010;42:1293–302.
    54. Liang C, Schimel JP, Jastrow JD. The importance of anabolism in microbial control over soil carbon storage. Nat Microbiol. 2017;2:1–6.
    55. Vranova V, Rejsek K, Formanek P. Proteolytic activity in soil: a review. Appl Soil Ecol. 2013;70:23–32.
    56. Schimel JP, Weintraub MN. The implications of exoenzyme activity on microbial carbon and nitrogen limitation in soil: a theoretical model. Soil Biol Biochem. 2003;35:549–63.
    57. Rawlings ND, Waller M, Barrett AJ, Bateman A. MEROPS: The database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 2014;42:503–9.
    58. Vollmer W, Joris B, Charlier P, Foster S. Bacterial peptidoglycan (murein) hydrolases. FEMS Microbiol Rev. 2008;32:259–86.
    59. Vermassen A, Leroy S, Talon R, Provot C, Popowska M, Desvaux M. Cell wall hydrolases in bacteria: Insight on the diversity of cell wall amidases, glycosidases and peptidases toward peptidoglycan. Front Microbiol. 2019;10:1–27.
    60. Donhauser J, Qi W, Bergk-Pinto B, Frey B. High temperatures enhance the microbial genetic potential to recycle C and N from necromass in high-mountain soils. Glob Chang Biol. 2020;27:1365–86.
    61. Vollmer W, Blanot D, De Pedro MA. Peptidoglycan structure and architecture. FEMS Microbiol Rev. 2008;32:149–67.
    62. Semchenko M, Leff JW, Lozano YM, Saar S, Davison J, Wilkinson A, et al. Fungal diversity regulates plant-soil feedbacks in temperate grassland. Sci Adv. 2018;4.
    63. Saary P, Mitchell AL, Finn RD. Estimating the quality of eukaryotic genomes recovered from metagenomic analysis. Genome Biol. 2020;21:244.

  • in

    Climate and land-use changes reduce the benefits of terrestrial protected areas

    1. Watson, J. E. M., Dudley, N., Segan, D. B. & Hockings, M. The performance and potential of protected areas. Nature 515, 67–73 (2014).
    2. Juffe-Bignoli, D. et al. Protected Planet Report 2014 (UNEP-WCMC, 2014).
    3. Gray, C. L. et al. Local biodiversity is higher inside than outside terrestrial protected areas worldwide. Nat. Commun. 7, 12306 (2016).
    4. Xu, W. et al. Strengthening protected areas for biodiversity and ecosystem services in China. Proc. Natl Acad. Sci. USA 114, 1601–1606 (2017).
    5. Naidoo, R. et al. Evaluating the impacts of protected areas on human well-being across the developing world. Sci. Adv. 5, eaav3006 (2019).
    6. Geldmann, J. et al. Effectiveness of terrestrial protected areas in reducing habitat loss and population declines. Biol. Conserv. 161, 230–238 (2013).
    7. Cazalis, V. et al. Effectiveness of protected areas in conserving tropical forest birds. Nat. Commun. 11, 4461 (2020).
    8. Elsen, P. R., Monahan, W. B., Dougherty, E. R. & Merenlender, A. M. Keeping pace with climate change in global terrestrial protected areas. Sci. Adv. 6, eaay0814 (2020).
    9. Hoffmann, S., Irl, S. D. H. & Beierkuhnlein, C. Predicted climate shifts within terrestrial protected areas worldwide. Nat. Commun. 10, 4787 (2019).
    10. Batllori, E., Parisien, M. A., Parks, S. A., Moritz, M. A. & Miller, C. Potential relocation of climatic environments suggests high rates of climate displacement within the North American protection network. Glob. Change Biol. 23, 3219–3230 (2017).
    11. Ward, M. et al. Just ten percent of the global terrestrial protected area network is structurally connected via intact land. Nat. Commun. 11, 4563 (2020).
    12. Jones, K. R. et al. One-third of global protected land is under intense human pressure. Science 360, 788–791 (2018).
    13. Parks, S. A., Carroll, C., Dobrowski, S. Z. & Allred, B. W. Human land uses reduce climate connectivity across North America. Glob. Change Biol. 26, 2944–2955 (2020).
    14. McGuire, J. L., Lawler, J. J., McRae, B. H., Nuñez, T. A. & Theobald, D. M. Achieving climate connectivity in a fragmented landscape. Proc. Natl Acad. Sci. USA 113, 7195–7200 (2016).
    15. Watson, J. E. M., Iwamura, T. & Butt, N. Mapping vulnerability and conservation adaptation strategies under climate change. Nat. Clim. Change 3, 989–994 (2013).
    16. Pecl, G. T. et al. Biodiversity redistribution under climate change: impacts on ecosystems and human well-being. Science 355, eaai9214 (2017).
    17. Jones, C., Giorgi, F. & Asrar, G. The coordinated regional downscaling experiment: CORDEX–an international downscaling link to CMIP5. CLIVAR Exch. 16, 34–40 (2011).
    18. Hurtt, G. C. et al. Harmonization of global land-use change and management for the period 850-2100 (LUH2) for CMIP6. Geosci. Model Dev. 13, 5425–5464 (2020).
    19. Loarie, S. R. et al. The velocity of climate change. Nature 462, 1052–1055 (2009).
    20. Ordonez, A., Martinuzzi, S., Radeloff, V. C. & Williams, J. W. Combined speeds of climate and land-use change of the conterminous US until 2050. Nat. Clim. Change 4, 811–816 (2014).
    21. UN General Assembly Resolution A/RES/70/1 (UN, 2015).
    22. Harrop, S. R. ‘Living in harmony with nature’? Outcomes of the 2010 Nagoya conference of the convention on biological diversity. J. Environ. Law 23, 117–128 (2011).
    23. Maxwell, S. L. et al. Area-based conservation in the twenty-first century. Nature 586, 217–227 (2020).
    24. Schloss, C. A., Nuñez, T. A. & Lawler, J. J. Dispersal will limit ability of mammals to track climate change in the Western Hemisphere. Proc. Natl Acad. Sci. USA 109, 8606–8611 (2012).
    25. Chen, I. C., Hill, J. K., Ohlemüller, R., Roy, D. B. & Thomas, C. D. Rapid range shifts of species associated with high levels of climate warming. Science 333, 1024–1026 (2011).
    26. Schwalm, C. R., Glendon, S. & Duffy, P. B. RCP8.5 tracks cumulative CO2 emissions. Proc. Natl Acad. Sci. USA 117, 19656–19657 (2020).
    27. Ando, A. W. & Mallory, M. L. Optimal portfolio design to reduce climate-related conservation uncertainty in the Prairie Pothole Region. Proc. Natl Acad. Sci. USA 109, 6484–6489 (2012).
    28. Ackerly, D. D. et al. The geography of climate change: implications for conservation biogeography. Divers. Distrib. 16, 476–487 (2010).
    29. Dobrowski, S. Z. & Parks, S. A. Climate change velocity underestimates climate change exposure in mountainous regions. Nat. Commun. 7, 12349 (2016).
    30. Hoegh-Guldberg, O. et al. in Special Report on Global Warming of 1.5°C (eds Masson-Delmotte, V. et al.) 175–311 (IPCC, WMO, 2018).
    31. Sandel, B. et al. The influence of late Quaternary climate-change velocity on species endemism. Science 334, 660–664 (2011).
    32. Ordonez, A., Williams, J. W. & Svenning, J.-C. Mapping climatic mechanisms likely to favour the emergence of novel communities. Nat. Clim. Change 6, 1104–1109 (2016).
    33. Carroll, C. et al. Scale-dependent complementarity of climatic velocity and environmental diversity for identifying priority areas for conservation under climate change. Glob. Change Biol. 23, 4508–4520 (2017).
    34. Alexander, J. M. et al. Lags in the response of mountain plant communities to climate change. Glob. Change Biol. 24, 563–579 (2018).
    35. Lawler, J. J. et al. Projected land-use change impacts on ecosystem services in the United States. Proc. Natl Acad. Sci. USA 111, 7492–7497 (2014).
    36. Stein, B. A. et al. Preparing for and managing change: climate adaptation for biodiversity and ecosystems. Front. Ecol. Environ. 11, 502–510 (2013).
    37. Elsen, P. R., Monahan, W. B. & Merenlender, A. M. Global patterns of protection of elevational gradients in mountain ranges. Proc. Natl Acad. Sci. USA 115, 6004–6009 (2018).
    38. Burrows, M. T. et al. The pace of shifting climate in marine and terrestrial ecosystems. Science 334, 652–655 (2011).
    39. Burrows, M. T. et al. Geographical limits to species-range shifts are suggested by climate velocity. Nature 507, 492–495 (2014).
    40. Fitzpatrick, M. C., Gove, A. D., Sanders, N. & Dunn, R. R. Climate change, plant migration, and range collapse in a global biodiversity hotspot: the Banksia (Proteaceae) of Western Australia. Glob. Change Biol. 14, 1337–1352 (2008).
    41. Dynesius, M. & Jansson, R. Evolutionary consequences of changes in species’ geographical distributions driven by Milankovitch climate oscillations. Proc. Natl Acad. Sci. USA 97, 9115–9120 (2000).
    42. Geldmann, J., Manica, A., Burgess, N. D., Coad, L. & Balmford, A. A global-level assessment of the effectiveness of protected areas at resisting anthropogenic pressures. Proc. Natl Acad. Sci. USA 116, 23209–23215 (2019).
    43. Tittensor, D. P. et al. Integrating climate adaptation and biodiversity conservation in the global ocean. Sci. Adv. 5, eaay9969 (2019).
    44. Osorio, F., Vallejos, R. & Cuevas, F. SpatialPack: Package for Analysis of Spatial Data. R package version 0.2-3 (2014).
    45. Williams, K. D. et al. The Met Office Global Coupled model 2.0 (GC2) configuration. Geosci. Model Dev. 8, 1509–1524 (2015).
    46. Giorgetta, M. A. et al. Climate and carbon cycle changes from 1850 to 2100 in MPI-ESM simulations for the Coupled Model Intercomparison Project Phase 5. J. Adv. Model. Earth Syst. https://doi.org/10.1002/jame.20038 (2013).
    47. Knudsen, E. M. & Walsh, J. E. Northern Hemisphere storminess in the Norwegian Earth System Model (NorESM1-M). Geosci. Model Dev. 9, 2335–2355 (2016).
    48. Brito-Morales, I. et al. Climate velocity can inform conservation in a warming world. Trends Ecol. Evol. 33, 441–457 (2018).
    49. García Molinos, J., Schoeman, D. S., Brown, C. J. & Burrows, M. T. VoCC: an R package for calculating the velocity of climate change and related climatic metrics. Methods Ecol. Evol. 10, 2195–2202 (2019).
    50. UNEP‐WCMC & IUCN Protected Planet: The World Database on Protected Areas (WDPA, 2018).
    51. Visconti, P. et al. Protected area targets post-2020. Science 364, eaav6886 (2019).
    52. Farr, T. G. et al. The shuttle radar topography mission. Rev. Geophys. https://doi.org/10.1029/2005RG000183 (2007).
    53. Olson, D. M. et al. Terrestrial ecoregions of the world: a new map of life on Earth. BioScience 51, 933–938 (2001).
    54. Ellis, E. C., Antill, E. C. & Kreft, H. All is not loss: plant biodiversity in the anthropocene. PLoS ONE 7, 30535 (2012).
    55. Asamoah, E. F. Climate Velocity and Land-use Instability 1971–2100 (Figshare, 2021); https://doi.org/10.6084/m9.figshare.14852955.v4