
Human activities favour prolific life histories in both traded and introduced vertebrates

Data collection

We obtained trade data from two different sources: the United States Fish and Wildlife Service (USFWS) Law Enforcement Management Information System (LEMIS)31 and the International Union for Conservation of Nature (IUCN) Red List32. We used the former to obtain data on the live wildlife trade in general and the latter for data on the pet trade specifically. We then matched trade data with our previously compiled global scale datasets of life history traits and introductions in mammals, reptiles and amphibians25,26.

We obtained data on the US live wildlife trade from LEMIS by a Freedom of Information Act Request on 12/08/2019. We requested summary data on all US imports and exports of wildlife across all available years (1999-2019) and all trade purposes, including information on species identities and shipment contents (e.g. live individuals, meat, skins, etc.). For each species, we summed the total number of recorded shipments of live individuals (including individuals that died in transit, and live eggs) as a measure of trade frequency. We classified species as in trade if there was at least one shipment of live individuals recorded in the LEMIS database, and as not traded otherwise. The LEMIS dataset is geographically limited to trade by the US, and therefore may not capture the full diversity of species involved in the wildlife trade. For example, the LEMIS database may be missing some species involved in the substantial trade in live wildlife between South–East Asian countries50. However, the US represents one of the most dominant players in the global market for live wildlife16, and by summing both imports and exports we capture demand for species in countries beyond the US to some extent. Supplementary Fig. 2 illustrates the frequency of trade between the US and countries represented in the US LEMIS dataset. LEMIS data should be considered a minimum estimate of the diversity of species involved in the wildlife trade since they mostly record only legal trade (although confiscated shipments are recorded), and shipments are sometimes not identified to the species level16,51,53,53. The LEMIS database also contains some mis-spelled and incorrectly identified species due to human input errors52. To minimise the effect of misidentified shipments on our species level classifications of US trade status, we discarded all LEMIS records that were not identified to the species level (i.e. those identified using genus, common or generic names only), and manually checked the LEMIS data for synonyms and alternate spellings when we could not automatically match any records in LEMIS with species in our life history datasets. Species classified as traded on the basis of a single recorded live shipment in LEMIS are most vulnerable to species level misclassification due to misidentified shipments. The vast majority of traded species have multiple shipments recorded in LEMIS (259/312 [83%] of traded mammals, 265/285 [93%] of traded reptiles and 72/75 [96%] of traded amphibians), reducing the potential impact of shipment level misidentification on the reliability of species level trade classifications. However, to investigate the robustness of our findings to possible errors in species identification in LEMIS, we re-ran our key analyses excluding species classified as traded on the basis of a single live shipment. We found qualitatively the same effects of life history traits on the probability of trade when removing these species as in our full sample (Supplementary Tables 25–27). Despite its limitations, LEMIS is an invaluable resource for identifying broad scale trends in the wildlife trade since few other countries maintain such detailed records, and it is the only large-scale international trade dataset that includes both CITES- and non-CITES-listed species16,41. Including non-CITES listed species in our analyses is important because CITES-listed species represent only a small minority of those in trade14 and are likely to be a biased sample in terms of life history traits, since species vulnerable to extinction typically have slower life histories40.
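The classification rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, field values and species list are hypothetical and do not reflect the actual LEMIS schema or the analysis code used for the study.

```python
from collections import Counter

# Hypothetical shipment records: (taxon label, identified to species?, contents).
# These fields are illustrative only, not the real LEMIS column names.
records = [
    ("Python regius", True, "live"),
    ("Python regius", True, "live"),
    ("Python regius", True, "skins"),
    ("Iguana iguana", True, "live"),
    ("Varanus sp.", False, "live"),   # genus-level record, discarded
    ("Vulpes vulpes", True, "skins"),
]

def live_shipment_counts(records):
    """Count live shipments per species, dropping records not identified to species."""
    counts = Counter()
    for taxon, to_species, contents in records:
        if to_species and contents == "live":
            counts[taxon] += 1
    return counts

def trade_status(species_list, counts):
    """A species is 'traded' if it has at least one recorded live shipment."""
    return {sp: counts.get(sp, 0) >= 1 for sp in species_list}

species = ["Python regius", "Iguana iguana", "Vulpes vulpes"]
counts = live_shipment_counts(records)
traded = trade_status(species, counts)

# Robustness check: drop species whose traded status rests on a single live shipment.
robust_sample = [sp for sp in species if counts.get(sp, 0) != 1]
```

In this toy sample, the robustness subset keeps species with several live shipments and species with none, and drops only those whose classification depends on one record.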

We obtained separate data on the pet trade from the IUCN Red List. The IUCN has assessed the vast majority of mammal, reptile and amphibian species (91%, 79% and 86% respectively54). Here, we classified a species as involved in the pet trade if the IUCN species account included at least one clear description of involvement in the pet trade. Otherwise, we considered a species as not involved in the pet trade. Although LEMIS records the purpose of trade, it uses broad categories (e.g. ‘Commercial’, ‘Personal’, ‘Breeding in captivity’), none of which refers specifically to nor necessarily equates to trade for pets. Therefore, we sought this additional data on the pet trade from the IUCN Red List instead of following the approach of some previous studies which have used LEMIS data as a proxy for the pet trade (e.g. Refs. 15,19). In contrast, the IUCN Red List contains clear textual descriptions of use and trade for many species, allowing us to identify which species are traded specifically for pets32. The IUCN data has further complementary strengths compared with LEMIS in that it is global in scope and includes both legal and illegal trade. We obtained data from the IUCN Red List by manually searching the binomial name of each species in our samples and consulting the ‘Threats’ and ‘Use and Trade’ sections of the species accounts. We classified species as in the pet trade if the information clearly stated this was the case (e.g. “It has been recorded in the pet trade”, “This species appears in the international pet trade”). We discounted descriptions where the information was uncertain (e.g. the species is described as “probably” or “possibly” traded for pets). We did not count as pets those species that the IUCN categorises as used for “Pets/display animals, horticulture” but which are used only for zoos or captive display, such as beluga whales (Delphinapterus leucas). All species described as pets by the IUCN are ‘exotic’, i.e. those without a long history of domestication14, since the IUCN does not list domesticated species.

We matched trade data with our previously published global scale datasets on life history traits and introductions25,26. Internationally traded species may or may not be released in the wild outside their native range: some may remain in the confines of captivity (e.g. in zoos or kept by private owners). We defined a species as introduced if there was at least one reliable record of its release, by humans, into the wild outside of its native range, either accidentally or intentionally25,26. We included only species with complete data for the same life history traits as used in our prior analyses (mammals: body mass, gestation period, weaning age, neonatal body mass, litter size, litters per year, age at first reproduction and reproductive lifespan; reptiles: body mass, hatchling mass, clutch size, clutches per year, age of sexual maturity, reproductive lifespan and parity; amphibians: snout-vent length, egg size, clutch size, age of sexual maturity and reproductive lifespan) to facilitate direct comparisons with previous results and to allow us to account for covariation between life history traits55. Species with complete life history data represent 7.8%, 3.5% and 1.6% of the total estimated number of species of mammals, reptiles and amphibians respectively56,57,58. These samples are not random as they over-represent orders containing many species of interest and utility to humans (e.g. ungulates, primates, crocodilians) (Supplementary Tables 28–30). However, these biases are unlikely to undermine our results since we examine life history effects on trade and introduction within these samples. Trade and introduction data do not necessarily cover the same time periods: the US dataset covers only the years 1999-present and the IUCN descriptions also typically refer to recent trade. In contrast, our introduction dataset includes both historical and recent introductions25,26.
Therefore, the goal of our analyses is not to test causal hypotheses on the direct relationship between trade and introduction but rather to investigate whether the same life history traits predispose species towards both trade and introduction across diverse taxa, locations and circumstances. When combining the datasets and phylogenies59,60,61,62,63, we resolved species name mis-matches by referring to taxonomic information from the IUCN Red List32, the Global Biodiversity Information Facility (GBIF33) and the Integrated Taxonomic Information System (ITIS64). Table 1 summarises final sample sizes and Supplementary Table 1 the degree of overlap between the trade datasets. Most species in the pet trade are also in the general live wildlife trade, but many more species are traded by the US for general purposes than are involved in the pet trade specifically.

Finally, we obtained data for a proxy measure of species detectability in order to control for a potential confounding effect on relationships between life history traits and introduction: larger bodied and longer-lived species may be more likely to be recorded by human observers when introduced compared with smaller and shorter-lived species. We obtained data on species occurrence records, geographic range size and population density, assuming that highly detectable species will have more recorded observations than expected based on the size of their geographic ranges and average population densities, following similar approaches by e.g. Refs. 65,66. We obtained occurrence records from the Global Biodiversity Information Facility (GBIF33) via the R package rgbif67, selecting only records resulting from human observation. We obtained range sizes (in decimal degrees squared) from the IUCN Red List32 and processed them for analysis using functions from the rgdal package68, excluding areas of uncertain presence (i.e. limiting range to presence code 1, ‘extant’). We obtained population density estimates from the TetraDENSITY database (version 134), a global database of population density estimates for terrestrial vertebrates. Most species in the TetraDENSITY dataset are represented by estimates from multiple different studies (median = 3, range 1–408). We collapsed density estimates to the species level by taking the median value across studies, including all estimates regardless of sampling method to maximise sample size, and converting all units to individuals/km² to ensure comparability.
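As a minimal sketch of the final step, the species-level median after unit conversion could be computed as follows. The estimates, unit labels and conversion table are hypothetical and are not taken from TetraDENSITY; only the conversion 1 km² = 100 ha is a fixed fact.

```python
from collections import defaultdict
from statistics import median

# Multipliers to convert each (hypothetical) unit label to individuals per km^2.
TO_PER_KM2 = {"ind/km2": 1.0, "ind/ha": 100.0}  # 1 km^2 = 100 ha

# Hypothetical study-level estimates: (species, value, unit).
estimates = [
    ("Vulpes vulpes", 1.2, "ind/km2"),
    ("Vulpes vulpes", 0.02, "ind/ha"),   # 2.0 individuals per km^2
    ("Vulpes vulpes", 3.0, "ind/km2"),
    ("Lacerta agilis", 0.5, "ind/ha"),   # 50.0 individuals per km^2
]

def species_median_density(estimates):
    """Convert every estimate to individuals/km^2, then take the median per species."""
    by_species = defaultdict(list)
    for species, value, unit in estimates:
        by_species[species].append(value * TO_PER_KM2[unit])
    return {sp: median(values) for sp, values in by_species.items()}

densities = species_median_density(estimates)
```

Using the median rather than the mean limits the influence of single extreme estimates when studies differ in sampling method.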

Statistical analyses

To investigate relationships between life history traits and trade, we run models treating US or pet trade as the outcome variable and life history traits as the predictors. For all analyses, all life history variables were included in the same models to account for covariation among life history traits55. For US trade, where data on trade frequency are available, we run models both in which trade is treated as a binary variable (traded vs. not traded) and as a count variable (frequency of live shipments, including zero values), while for the pet trade, we have no data on trade frequency and so we treat pet trade as a binary variable only. To investigate the effects of life history traits on introduction, we run models in which introduction is the outcome variable and life history traits are the predictors. In introduction models, we only include traded species (running separate models for the set of species in US trade and the set of species in the pet trade). This approach allows us to disentangle effects associated with trade and introduction and thus identify at which stage(s) life history biases emerge. We also run introduction models in which frequency of US trade is included as an additional predictor alongside life history traits, anticipating that highly traded species are more likely to be introduced. Finally, to investigate possible confounding effects of species detectability on relationships between life history traits and introduction, we investigate effects of number of observations, geographic range size and, where sample sizes allowed, population density on the probability of introduction. If highly detectable species are more likely to be recorded as introduced, we expect to find a positive effect of the number of observations (while accounting for geographic range size and population density) on the probability of introduction. 
If this effect confounds relationships between body mass/lifespan and introduction, the effect of these life history traits on the probability of introduction should disappear when detectability measures are included in the models alongside life history traits. All analyses were conducted using the R statistical programming environment (Version 4.2.069). Plots were coloured using palettes from the viridis package70.

To estimate effects of predictor variables, we fit generalized linear mixed models (GLMMs) using Markov chain Monte-Carlo (MCMC) estimation, implemented in the MCMCglmm package35,36. For analyses with binary outcome variables (traded vs. not traded, introduced vs. not introduced) we fit probit models, while for analyses with US trade frequency as the outcome variable we fit hurdle models. Hurdle models estimate two latent variables: the probability that the outcome is zero (on the logit scale), and the probability of the outcome modelled as a Poisson distribution for non-zero values71. This method therefore allows us to estimate effects of life history traits on the probability and frequency of trade in the same model. While the binary component of a hurdle model estimates the probability that outcomes are zero, when reporting results we reverse the sign of coefficients from the binary model for ease of interpretation, so that effects can be interpreted as the probability that the outcome is not zero. Therefore, here predictors with consistent effects on the probability and frequency of trade in hurdle models will have the same sign (so that if, for example, litter size has a positive effect on both the probability and frequency of trade, both coefficients for litter size from the hurdle model will be positive).
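The two-part structure of a hurdle model can be illustrated with a short Python sketch. It uses the generic textbook form (a logit model for zeros and a zero-truncated Poisson for positive counts) and is an assumption for illustration, not a reproduction of the internal MCMCglmm parameterisation. The function name and arguments are hypothetical.

```python
import math

def hurdle_poisson_pmf(y, logit_zero, log_rate):
    """P(Y = y) for a hurdle Poisson model.

    logit_zero: linear predictor for the probability of a zero (logit scale).
    log_rate:   linear predictor for the Poisson rate of non-zero counts (log scale).
    """
    p_zero = 1.0 / (1.0 + math.exp(-logit_zero))
    if y == 0:
        return p_zero
    lam = math.exp(log_rate)
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    # Rescale so that the positive counts alone sum to one (zero-truncation).
    truncated = poisson / (1.0 - math.exp(-lam))
    return (1.0 - p_zero) * truncated

# The probabilities over all counts sum to one (checked here up to y = 59).
total = sum(hurdle_poisson_pmf(y, logit_zero=-0.5, log_rate=1.0) for y in range(60))

# Reversing the sign of the binary linear predictor gives the probability of a
# non-zero outcome, which is the reporting convention described above.
p_nonzero = 1.0 / (1.0 + math.exp(-(0.5)))
```

Because the logistic function satisfies 1 - f(x) = f(-x), flipping the sign of a coefficient in the binary component converts an effect on the probability of zero into the corresponding effect on the probability of a non-zero outcome.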

Datasets comprising biological measures from multiple related species violate the fundamental statistical assumption that observations are independent of one another, since closely related species are more phenotypically similar than expected by chance due to their shared evolutionary history72. To account for the non-independence of species due to shared ancestry, we included a phylogenetic random effect in all models, represented by a variance-covariance (VCV) matrix derived from the phylogeny. The off-diagonal elements of the VCV matrix contain the amount of shared evolutionary history for each pair of species35,37,38 based on the branch lengths of the phylogeny (here proportional to time)59,61,62,63,63. This approach allows us to estimate phylogenetic signal using the heritability (H2) parameter, which measures the proportion of total variance in the latent variable attributable to the phylogeny35,37,38. Heritability is interpreted in the same way as Pagel’s λ in phylogenetic generalized least squares regression35,37,38,72. Specifically, phylogenetic signal is constrained between 0, indicating no phylogenetic effect so that species can be treated as independent, and 1, indicating that similarity between species is directly proportional to their amount of shared evolutionary history35,38,72. As hurdle models estimate two latent variables, for each hurdle model we report two heritability estimates, one for the binary and one for the Poisson component. All continuous independent variables were log-10 transformed due to positively skewed distributions. Although GLMMs do not require normally distributed predictor variables, log-transforming positively skewed life history predictors in phylogenetic comparative analyses allows us to model life history evolution on proportional rather than absolute scales. This is important as it facilitates biologically meaningful comparisons between species across large scales of life history variation73. 
Further, log-transforming positively skewed predictors helps to meet assumptions of the underlying Brownian motion model of evolutionary change, which assumes that phenotypic change along branches of the phylogeny is normally distributed74.
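To make the structure of the VCV matrix concrete, the following Python sketch builds it for a toy three-species tree. The tree, its branch lengths and the helper names are hypothetical. The heritability function assumes that total variance is simply the sum of the phylogenetic and residual variances, which is a simplification that ignores any link-specific variance term.

```python
# Hypothetical ultrametric tree stored as child -> (parent, branch length).
tree = {
    "n1": ("root", 2.0),
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "C": ("root", 3.0),
}

def path_to_root(tip, tree):
    """List the (node, branch length) pairs on the path from a tip to the root."""
    path = []
    node = tip
    while node in tree:
        parent, length = tree[node]
        path.append((node, length))
        node = parent
    return path

def shared_history(a, b, tree):
    """Sum the branch lengths common to the root-to-tip paths of a and b."""
    nodes_b = {node for node, _ in path_to_root(b, tree)}
    return sum(length for node, length in path_to_root(a, tree) if node in nodes_b)

def vcv(tips, tree):
    """Diagonal: root-to-tip distance. Off-diagonal: shared evolutionary history."""
    return [[shared_history(a, b, tree) for b in tips] for a in tips]

def heritability(var_phylo, var_resid):
    """Share of total variance attributed to phylogeny (simplified, see lead-in)."""
    return var_phylo / (var_phylo + var_resid)

matrix = vcv(["A", "B", "C"], tree)
```

Here A and B share two units of history before diverging, while C shares none with either, so the off-diagonal entries for C are zero.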

We calculated variance inflation factors (VIFs) using functions from the car R package75 to check for multicollinearity between predictor variables. Where any model reported a variance inflation factor of 5 or above, indicating potentially problematic levels of collinearity76, we re-ran the model removing the variable with the highest VIF iteratively until all the remaining variables had VIFs of <5. We also illustrate the correlations between the life history variables included in our models in Supplementary Figs. 3–5, which suggest evidence for both classic fast-slow life history trade-offs (e.g. smaller, less frequent litters in larger, longer-lived mammal species) and more complex patterns (e.g. larger clutches in larger-bodied, longer-lived reptile species). For each model, we report the mean estimates from posterior distributions for all parameters, and the percentage of fixed effect parameter estimates crossing zero in the direction opposite to the majority of estimates, as a measure of the strength of evidence for individual fixed effects in a specific direction. We used default, diffuse normal prior probability distributions for the fixed effects (mean = 0, variance = 10¹⁰). For the phylogenetic random effect in probit models, we used a chi-squared prior distribution which better approximates a uniform prior compared with more commonly used inverse-Gamma priors37 (with V = 1, ν = 1000, αμ = 0, αV = 171). The residual variance is fixed to 1 since models with binary dependent variables do not provide sufficient information to estimate residual variance (following37). For the binary component of the hurdle model, we used the same chi-squared prior for the phylogenetic random effect and fixed prior for the residual variance as we used in the probit model. 
For the Poisson component of the hurdle model, we used commonly implemented inverse-Wishart priors (with V = 1 and ν = 0.002, equivalent to inverse-gamma distributions with shape and scale = 0.00171) for the phylogenetic random effect and the residual variance. By modelling residual variance separately, MCMCglmm accounts for over-dispersion in the distribution of the non-zero response values71, which is common in count data. We ran each model for a sufficient number of iterations to obtain effective sample sizes of at least 1000 for all parameters (5,010,000 iterations, with a burn-in period of 10,000 iterations, sampling every 1000 generations). Model convergence was also confirmed by visual examination of posterior distributions and chain plots.
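The chain settings and the reported posterior summaries can be illustrated with a short Python sketch. The function names and the toy samples are hypothetical; the arithmetic for the number of stored samples follows directly from the iteration, burn-in and thinning values given above.

```python
from statistics import mean

def stored_samples(iterations, burn_in, thin):
    """Number of posterior samples retained after burn-in and thinning."""
    return (iterations - burn_in) // thin

def summarise_posterior(samples):
    """Return the posterior mean and the percentage of samples on the minority side of zero."""
    n_pos = sum(s > 0 for s in samples)
    n_neg = sum(s < 0 for s in samples)
    pct_opposite = 100.0 * min(n_pos, n_neg) / len(samples)
    return mean(samples), pct_opposite

n_stored = stored_samples(5_010_000, 10_000, 1000)

# Toy posterior sample for one fixed effect (illustrative values only).
post_mean, pct_opposite = summarise_posterior([0.5, 0.2, -0.1, 0.4])
```

With the settings reported here, each chain stores 5000 samples, so an effective sample size of at least 1000 requires that autocorrelation between stored samples is modest.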

Finally, we assessed the ability of our models to predict the probability of trade and introduction for species within and outside the sample, based on both fixed effects (life history traits and body mass) and the phylogenetic random effect. For out-of-sample predictions, we used leave-one-out cross-validation (LOOCV), i.e. we re-ran the model excluding each species in turn, obtained predictions for the missing species and compared these with the observed values. For both types of predictions, we calculated the area under the curve (AUC) of the receiver operating characteristic (ROC) curve as a measure of classification performance, using the cvAUC package77. AUC values indicate the probability that a randomly drawn positive observation (in this case, a species that is traded or introduced) has a predicted value that is greater than a randomly drawn negative observation (i.e. a species that is not traded or introduced)39. AUCs vary from 0.5 to 1, where 0.5 indicates that the model predictions are no better than random guesses, and 1 indicates perfect distinction (i.e. positive observations always have greater predicted probabilities than negative observations)39. Next, we used predicted values from the model to identify species not listed as traded or introduced in our datasets, but with high predicted probabilities of trade or introduction which may indicate high risk of future trade or introduction. To do so, we extracted predicted probabilities from models and identified non-traded or non-introduced species with high predicted probabilities for each vertebrate class. LOOCV required re-running each of our models once for every species in the sample (N runs per model), which necessitated shorter MCMC chains to avoid impractically long run-times. In supplementary analyses, we show that predictions do not differ between models based on short versus long chains (Supplementary Table 31).
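The probabilistic interpretation of AUC given above can be checked directly with a pairwise comparison. The sketch below is a plain Python illustration of that definition, with ties counted as one half. It is not the cvAUC implementation used in the study, and the labels and predicted values are hypothetical.

```python
def auc(labels, scores):
    """AUC as the share of positive-negative pairs in which the positive scores higher."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Hypothetical observed status (1 = traded) and predicted probabilities.
observed = [1, 1, 0, 0, 1]
predicted = [0.9, 0.4, 0.5, 0.2, 0.8]
example_auc = auc(observed, predicted)
```

In this toy example, five of the six positive-negative pairs are ranked correctly. For LOOCV, the same function would be applied to predictions that were each obtained from a model fitted without the species being predicted.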

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.


Source: Ecology - nature.com
