Plant defence traits
We compiled species-level data for five plant traits: wood density (WD), leaf and stem spinescence, latex production, and leaf size, for tropical and extra-tropical South and Central American woody species (i.e., the Neotropical biogeographic realm). WD was obtained for 2577 species from Zanne et al.44. We used WD data only from this source because it reports WD measured in stems, whereas most other studies with available data measured WD in branches. Leaf size data were obtained for 2660 woody species from Wright et al.37. We did not include leaf size from herbaceous species because herbaceous and woody species are influenced by different megafauna guilds, suggesting distinct mechanisms, and because this dataset37 only included data for 253 Neotropical herbaceous species. Data on the presence or absence of stem (and/or branch) spines (mostly thorns, but also prickles) were obtained from Dantas and Pausas45 for Neotropical savanna and forest species (1004 species) and complemented with other literature sources for other ecoregions (listed in the supplementary materials), using the names of the species for which we had WD and Leaf Size data. Our final stem spines dataset included 2843 woody species. We also compiled data on the presence of latex in plant stems and leaves for all the species for which we had data on other traits (3160 species; references in the supplementary materials). Finally, we compiled data on leaf spines. Although we found leaf spine data for a total of 2173 woody species, spinescence in leaves was strongly concentrated in the palm family (Arecaceae; 198 out of 221 species with leaf spines). Moreover, among the non-palm species, all but three also presented stem spines, indicating that, for other taxa, leaf spines might depend on the presence of stem spines in the region (in palms, 51% have stem spines). Thus, we only used leaf spinescence data of palm species (694 species) from the global Palm Traits Database 1.046.
For wood density and leaf size, we often had more than one trait value per species (1005 and 831 species with more than one trait value, respectively). Thus, we computed the species mean trait value. Multiple values rarely occurred for binary traits (spinescence and latex) and, when they did, the maximum value was used (0 for absence and 1 for presence). This latter decision was based on the assumption that omitting the presence of spines or latex is more likely than incorrectly reporting a presence when the trait is absent. Moreover, some of these traits can be plastic18.
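The aggregation rule described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name, trait labels and example records are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate_traits(records, binary_traits=("stem_spines", "latex")):
    """Collapse repeated (species, trait, value) records to one value each.

    Continuous traits use the mean; binary traits (0/1) use the maximum,
    so a single reported presence outweighs any number of absences.
    """
    grouped = defaultdict(list)
    for species, trait, value in records:
        grouped[(species, trait)].append(value)
    return {
        key: (max(values) if key[1] in binary_traits else mean(values))
        for key, values in grouped.items()
    }


# Hypothetical records, for illustration only.
records = [
    ("Species A", "wood_density", 0.50),
    ("Species A", "wood_density", 0.70),
    ("Species A", "stem_spines", 0),
    ("Species A", "stem_spines", 1),
    ("Species B", "latex", 0),
]
species_traits = aggregate_traits(records)
```

Taking the maximum for binary traits encodes the stated assumption that a missed presence is more likely than a false presence.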
From species to ecoregions
We searched for geographical distribution data (coordinates) from the Global Biodiversity Information Facility (GBIF) for all of the species in each species-trait dataset (Data available from GBIF using the following doi: WD: https://doi.org/10.15468/dl.3vua3x; Stem spines: https://doi.org/10.15468/dl.ar5ddj; Latex: https://doi.org/10.15468/dl.m8dzjd; Leaf spines: https://doi.org/10.15468/dl.vv8gw4; Leaf size: https://doi.org/10.15468/dl.k98nxc). For this search, we used tools provided by the “rgbif” package for R, in which species names are updated to the most recent classification and the returned occurrences also include those associated with synonyms (i.e., the “backbone” method). We labelled the obtained geographical coordinates according to their ecoregion and biogeographical realm (following Dinerstein et al.47) and cropped out occurrences falling outside of the Neotropical realm. Since occurrence data were not available for all the species in our initial trait dataset, the number of species used to calculate ecoregion-level means was reduced to 2110 species for wood density, 2133 for leaf size, 2629 for stem spines, 2714 for latex, and 657 for leaf spines. A detailed evaluation of the representativeness of these data in relation to ecoregion- and Neotropical-level patterns can be found in the Supplementary Methods. Based on the occurrence data and their ecoregion label, we built a species abundance (columns) by ecoregion (rows) matrix for each trait.
We obtained ecoregion-scale abundance-weighted means for continuous traits (WD and Leaf Size) by: (1) multiplying species abundance in each grid cell of the ecoregion by the mean species value; (2) summing up the row values; (3) dividing the resulting row sum by the total species abundance (row sum prior to trait multiplication); and (4) calculating the ecoregions’ means (across all of the grid cells). For Stem Spines and Latex (binary traits), we used a similar procedure, but the maximum (0 for absence and 1 for presence) value was used instead of the mean in step (1), and step (2) was directly used to calculate the number of presences (i.e., 1s). Moreover, instead of steps (3) and (4), we calculated the number of absences as the difference between the total abundance (row sums before trait multiplication) and the values obtained in step (2). This process resulted in weighted means for WD and stem spinescence for 173 ecoregions, and Leaf Size and Latex for 174, out of the 179 Neotropical ecoregions. For leaf spinescence, we used a similar approach, although, because of the fewer species, the abundance estimate from GBIF was less reliable. Thus, we transformed the ecoregion species abundance to presence/absence before multiplying the trait values (0/1 for absence/presence). We obtained leaf spinescence data for 159 out of the 179 Neotropical ecoregions. The species- and ecoregion-level data are provided in the Supplementary Data and in ref. 47.
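Steps (1) to (4) for continuous traits, and the presence and absence counts for binary traits, can be sketched as below. This is a simplified illustration under assumed data structures (one dictionary of species abundances per grid cell); the function names and example values are hypothetical and are not taken from the code in ref. 47.

```python
from statistics import mean


def cell_weighted_mean(abundances, trait_values):
    """Abundance-weighted trait mean for one grid cell (steps 1-3)."""
    total = sum(abundances.values())
    if total == 0:
        return None  # no occurrence records in this cell
    weighted_sum = sum(n * trait_values[sp] for sp, n in abundances.items())
    return weighted_sum / total


def ecoregion_weighted_mean(cells, trait_values):
    """Mean of the cell-level weighted means across an ecoregion (step 4)."""
    cell_means = [cell_weighted_mean(c, trait_values) for c in cells]
    cell_means = [m for m in cell_means if m is not None]
    return mean(cell_means) if cell_means else None


def presence_absence_counts(abundances, binary_trait):
    """Binary traits: number of records with and without the trait."""
    total = sum(abundances.values())
    presences = sum(n * binary_trait[sp] for sp, n in abundances.items())
    return presences, total - presences


# Hypothetical values, for illustration only.
wd = {"A": 0.4, "B": 0.8}
cells = [{"A": 3, "B": 1}, {"A": 1, "B": 1}]
ecoregion_wd = ecoregion_weighted_mean(cells, wd)
spiny, non_spiny = presence_absence_counts({"A": 3, "B": 1}, {"A": 1, "B": 0})
```

The pair of counts returned for binary traits corresponds to the successes and failures that a binomial model takes as its response.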
Historical megafauna distribution
We obtained data on the historical distribution of megafauna species from the MegaPast2Future/PHYLACINE_1.2 dataset24, a dataset containing distribution maps (96.5 km of spatial resolution) and functional traits for mammal species of the last 130,000 years. From this dataset, we obtained the probable distribution that extinct large mammal herbivore (hereafter, “megafauna”) species would have if they were still alive today (“Present Natural” scenario; see details below). The “Present Natural” distribution of extinct species in this database is based on the estimated historical distribution (i.e., preceding anthropogenic range modifications) of extant species that are known (from the fossil record) to have coexisted with the extinct species. In this approach, an extinct species is considered to have been present in a given grid cell if at least 50% of the extant species that were found coexisting with the extinct species in the fossil (and subfossil) record were predicted to have occurred in the same cell prior to anthropogenic range modifications24,48. This approach assumes that, since extant and extinct species coexisted in the same locations, they must have had similar ecological requirements. It also assumes that megafauna extinction had anthropogenic causes rather than causes related to climate change49, an assumption that is largely accepted in the literature50.
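The 50% co-occurrence rule can be illustrated for a single grid cell as follows. This is a deliberately simplified sketch of the rule as summarised above, not the PHYLACINE implementation; the function name and inputs are hypothetical.

```python
def present_natural_presence(coexisting_extant_presence, threshold=0.5):
    """Decide whether an extinct species is assigned to a grid cell.

    coexisting_extant_presence: list of 0/1 values, one per extant species
    known from the fossil record to have coexisted with the extinct species,
    marking whether that extant species is predicted to occur in the cell.
    """
    if not coexisting_extant_presence:
        return False  # no coexisting extant species to base the estimate on
    fraction = sum(coexisting_extant_presence) / len(coexisting_extant_presence)
    return fraction >= threshold


# Hypothetical cells: two of four, then one of three, coexisting species present.
cell_a = present_natural_presence([1, 1, 0, 0])
cell_b = present_natural_presence([1, 0, 0])
```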
We extracted the “Present Natural” distribution of extinct mammals (coded “EP” for IUCN status; i.e., “extinct in prehistory”, meaning before 1500 CE) whose body mass was higher than 50 kg (megafauna), and for which at least 90% of their diet consisted of plants (i.e., strict herbivores). For each ecoregion, we began by calculating two megafauna-related metrics: extinct megafauna species richness (Mrich) and their mean body mass (Mbm). For this, we cropped the distribution maps of the megafauna species (containing 1 for presence and 0 for absence of each species) to the Neotropical realm. To calculate Mrich, we (1) counted species presences within each of the grid cells in the global grid (i.e., calculated the cell’s megafauna richness); (2) assigned the corresponding ecoregion label to the resulting richness grid cells and subset the richness cell values corresponding to the Neotropical region; and (3) calculated the mean for each Neotropical ecoregion. For Mbm, we replaced the presences of the megafauna species in the initial raster object (grid cell map of each megafauna species) by their body masses and calculated the grid cell-level mean body mass, before calculating the ecoregion-level means. We also calculated megafauna density and secondary productivity based on allometric equations that relate these metrics to megafauna body mass. However, we did not use megafauna density and secondary productivity because they were strongly correlated with megafauna richness (Supplementary Fig. 3). More details on how these metrics were calculated can be found in the Supplementary Methods.
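The calculation of Mrich and Mbm can be sketched as below, assuming each grid cell is represented by an ecoregion label and a dictionary of species presences. The names, the example body masses and the choice to skip cells without megafauna when averaging body mass are assumptions made for illustration.

```python
from statistics import mean


def cell_metrics(presence_by_species, body_mass_kg):
    """Richness and mean body mass of the species present in one cell."""
    present = [sp for sp, p in presence_by_species.items() if p == 1]
    richness = len(present)
    mean_mass = mean(body_mass_kg[sp] for sp in present) if present else None
    return richness, mean_mass


def ecoregion_metrics(cells, body_mass_kg):
    """Average the cell-level metrics within each ecoregion.

    cells: list of (ecoregion_label, presence_by_species) pairs.
    Assumption: cells without megafauna are skipped for body mass.
    """
    richness, mass = {}, {}
    for ecoregion, presence in cells:
        r, m = cell_metrics(presence, body_mass_kg)
        richness.setdefault(ecoregion, []).append(r)
        if m is not None:
            mass.setdefault(ecoregion, []).append(m)
    mrich = {eco: mean(values) for eco, values in richness.items()}
    mbm = {eco: mean(values) for eco, values in mass.items()}
    return mrich, mbm


# Hypothetical species and masses, for illustration only.
body_mass = {"X": 100.0, "Y": 300.0}
cells = [
    ("E1", {"X": 1, "Y": 1}),
    ("E1", {"X": 1, "Y": 0}),
    ("E2", {"X": 0, "Y": 0}),
]
mrich, mbm = ecoregion_metrics(cells, body_mass)
```

Diet-specific richness follows the same pattern after restricting the presence dictionaries to the species of one feeding group.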
We also obtained diet preference information from the literature for most megafauna species that occurred in the Neotropical region (details and references in the Supplementary Material). Based on this information, we calculated the richness of large browser (MBrich for megabrowser richness), grazer (MGrich for megagrazer richness), and mixed-feeder (MMfrich for mega mixed-feeder richness) species by subsetting the megafauna species by grid cell array before the richness calculation, in order to select only species that were classified within the corresponding subgroup.
Extant herbivore mammal distribution
We also compiled data on the distribution, body mass and diet of extant and recently extinct (i.e., extinct after 1500 CE) herbivore mammal species (for simplicity, called ‘extant’ species in this study). As with megafauna maps, the distributions used represented reconstructions for periods preceding anthropogenic reduction of extant herbivore ranges (“Present Natural” scenario), based on abiotic, biotic and geographic variables48, rather than the currently observed distribution. This scenario was used because modern anthropogenic range reductions are too recent to produce substantial geographic effects at this spatial scale. These data were obtained by subsetting the MegaPast2Future/PHYLACINE_1.2 dataset to exclude species that were coded “EP” for IUCN status and species that were not strict herbivores (at least 90% of the diet consisting of plants). We subsequently associated diet information with these species using data from ref. 51 and excluded all species that did not feed mainly on aboveground vegetative plant tissues (i.e., species that fed mostly on fruits, seeds or roots were excluded). This latter filtering was applied because the number of herbivores that feed mostly on seeds and fruits increases with decreasing size (and this dataset included small mammals). We subsequently calculated the same metrics as for the extinct megafauna species (except for the richness of mixed-feeders, as our source for diets50 labelled species according to dominant feeding pattern). For this, we used the same approach described for extinct megafauna species. We did not use a size threshold for extant species because there were only 13 extant mammal herbivore species weighing over 50 kg in the Neotropical region, most of which were grazers (9 species; 4 species were mixed-feeders and none were browsers). Therefore, we relied on the mean body mass metric calculated for extant mammals to detect potential size-related effects.
Climate, soil, fire, insularity, and hurricanes
For each ecoregion, we obtained data on climate (mean annual precipitation and temperature, and rainfall seasonality) and soil (sand content, pH, and cation exchange capacity) variables. Climate data were obtained from WorldClim 2.1 (10 min spatial resolution) and were based on climate data from 1970 to 200052. Soil data were obtained from SoilGrids (5 km of spatial resolution)53, and consisted of mean values for two depths, 0.05 and 2 m. We calculated ecoregion-level means for all of the soil and climate variables after intersecting the climate and soil grid maps with the ecoregion map.
We obtained the number (a proxy for frequency) and intensity of wildfires per ecoregion area using the MODIS active fire location product (MCD14ML)54. We only considered fires (i.e., hotspots) with detection confidence of 95% or higher occurring from November 2000 to December 2019 (both included). To ensure that only wildfires were considered, we associated each fire pixel with a land cover type (300 m of spatial resolution) from ref. 55 for a buffer area of 1000 m surrounding the fire pixel centroid. We excluded all of the fires occurring in areas in which more than 10% of the surrounding land cover pixels corresponded to agricultural, urban and water classes. We calculated the number of wildfires per ecoregion area by dividing the fire count of each Ecoregion by the ecoregion area, and multiplying the resulting value by the proportion of vegetated land cover pixels (same classes used to exclude fires in anthropogenic areas and water bodies above). Fire intensity was estimated as the average fire radiative power across all detected MODIS hotspots in the ecoregion. Ecoregions lacking large preserved vegetated areas (criteria above) were excluded from subsequent analyses.
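The wildfire filter and the per-area fire count described above can be sketched as follows. The function names, land cover labels and example numbers are hypothetical; the thresholds (95% confidence, 10% of buffer pixels) are those stated in the text.

```python
def is_wildfire(confidence, buffer_land_cover,
                excluded=("agricultural", "urban", "water"),
                min_confidence=95, max_excluded_fraction=0.10):
    """Keep a hotspot only if it is confidently detected and its buffer
    contains at most 10% agricultural, urban or water pixels."""
    if confidence < min_confidence or not buffer_land_cover:
        return False
    n_excluded = sum(1 for cover in buffer_land_cover if cover in excluded)
    return n_excluded / len(buffer_land_cover) <= max_excluded_fraction


def wildfire_density(n_wildfires, area_km2, vegetated_fraction):
    """Fire count per unit area, multiplied by the vegetated proportion."""
    return n_wildfires / area_km2 * vegetated_fraction


# Hypothetical hotspots: land cover labels of the pixels in each 1000 m buffer.
kept = is_wildfire(97, ["forest"] * 9 + ["urban"])
dropped = is_wildfire(97, ["forest"] * 8 + ["urban"] * 2)
density = wildfire_density(50, 1000.0, 0.8)
```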
Using the ecoregion map, we also classified ecoregions as insular (1), when most of the ecoregion area was located on islands, or continental (0) otherwise. This was done because island biogeography theory predicts that, on islands, species richness should be low due to low colonization and high extinction rates. Insularity has also been shown to reduce megafauna body size (i.e., the island rule), even though the mechanisms are not fully understood56. We also compiled data on hurricane activity, as wood density has been suggested to confer resistance against this disturbance57. We used data from 1990 to 2019 from the HURDAT2 dataset58, containing six-hourly information about the location of all of the known tropical and subtropical cyclones (0.1° latitude/longitude). We used the sum of hurricane occurrences per ecoregion divided by ecoregion area as an indicator of hurricane activity.
Statistical analyses
To understand megafauna patterns, we began by fitting (multiple) regression models with habitat-related (fire, climate, soil) and geographical (insularity) variables as predictors. We expected megafauna richness in general to be higher under savanna conditions (arid nutrient-rich or mesic nutrient-poor environments with frequent fires)1,22. We also expected megafauna richness and body mass to be negatively affected by insularity (i.e., following the island biogeography theory and island rule). Before the analyses, we tested the correlations among all of the variables that would eventually be entered as predictors in the same model for both the megafauna and trait models (Supplementary Table 1), in order to avoid multicollinearity associated with highly correlated variables (here, |r| ≥ 0.60). Since mean annual precipitation and soil pH were strongly negatively correlated (r = −0.78), for all of the analyses (including the analyses with functional traits, described below), model selection was performed separately for these two variables (i.e., two different model selection procedures, one containing each of the two variables among the initial set of predictors). We selected the best of the two resulting models as that with the lowest AIC (differences higher than two points in all of the cases). To make sure that no multicollinearity remained, we also calculated the Variance Inflation Factor (VIF) for each predictor variable as 1/tolerance, where tolerance is 1 minus the R2 of the model regressing that predictor against all of the other predictors. In all of the models, VIF was 3.33 or smaller (i.e., a tolerance of 0.30 or higher), indicating absence of multicollinearity.
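The relationship between R2, tolerance and VIF used above can be written as a short function. This is only a worked illustration of the definition; the function name is hypothetical and the R2 is assumed to come from regressing one predictor on all of the others.

```python
def vif_from_r2(r_squared):
    """Variance Inflation Factor from the R2 of a predictor-on-predictors model."""
    tolerance = 1.0 - r_squared
    if tolerance <= 0:
        raise ValueError("predictor is perfectly explained by the others")
    return 1.0 / tolerance


# An R2 of 0.70 gives a tolerance of 0.30, the limit reported in the text.
vif_at_limit = vif_from_r2(0.70)
```

An R2 of 0.70 therefore corresponds to a VIF of about 3.33, which matches the largest value reported for these models.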
Model simplification was carried out iteratively using a stepwise (both forward and backward) search for the model with the lowest AIC (using R’s “step” function), subsequently retaining only the significant variables (p ≤ 0.05). We calculated the Pearson r statistic as a measure of effect size for the selected variables, as well as the associated confidence intervals, using the packages “parameters” and “effectsize” for R. The average contribution of each predictor variable was also calculated, using the package “dominanceanalysis”, as the mean difference in R2 before and after removing the target variable from models containing all of the possible subset combinations of the selected predictor variables, including the full selected model.
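The average contribution of a predictor can be illustrated with a small sketch that takes the R2 of every predictor subset as given. Averaging first within each subset size and then across sizes is an assumption here, following how general dominance is commonly defined; the function name and R2 values are hypothetical and this is not the “dominanceanalysis” implementation.

```python
from itertools import combinations
from statistics import mean


def general_dominance(r2, predictors):
    """Average gain in R2 from adding each predictor to subsets of the others.

    r2: dict mapping frozenset(subset of predictors) -> R2 of that model,
        including frozenset() for the intercept-only model.
    """
    result = {}
    for target in predictors:
        others = [p for p in predictors if p != target]
        per_size = []
        for k in range(len(others) + 1):
            gains = [
                r2[frozenset(subset) | {target}] - r2[frozenset(subset)]
                for subset in combinations(others, k)
            ]
            per_size.append(mean(gains))  # mean gain for subsets of size k
        result[target] = mean(per_size)   # mean across subset sizes
    return result


# Hypothetical R2 values for a two-predictor model.
r2_table = {
    frozenset(): 0.0,
    frozenset({"fire"}): 0.30,
    frozenset({"rain"}): 0.20,
    frozenset({"fire", "rain"}): 0.40,
}
contributions = general_dominance(r2_table, ["fire", "rain"])
```

With this averaging scheme the contributions sum to the R2 of the full model, which makes them easy to read as shares of explained variance.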
To test whether the studied plant functional traits were related to our megafauna indicators, we fit linear models to WD and leaf size, and generalized linear models (GLM; binomial family) to spinescence and latescence, using ecoregion as the unit. For spinescence and latescence, we used the matrix containing the count of spiny/latex and non-spiny/non-latex plants (species abundance; for stem spines and latex) or the number of species with or without spines (for leaf spines; see above) as response variables. The predictor variables included the animal indicators for extinct megafauna and extant herbivores, as well as climate, soil, and fire predictors (and, for WD, hurricane counts). Because total, megagrazer, megabrowser, and mega mixed-feeder species richness were strongly positively correlated (Supplementary Table 1), we used the richness difference between grazers and browsers to evaluate the effect of diet (Supplementary Fig. 1). For consistency, we used the same diet variable for extant and extinct species. Since we did not identify strong correlations among extinct megafauna and extant herbivore indicators (Supplementary Table 1), these variables were all entered simultaneously in the same initial models. As with the analyses of the megafauna indices, we also used r as effect size and calculated the average predictor contribution in terms of R2 for these models. For the latter, we used the McFadden Pseudo-R2 in the GLM models as implemented in the “pscl” and “dominanceanalysis” packages for R, as this statistic is the most comparable with R2 from linear multiple regression (Maximum Likelihood and Cragg and Uhler’s Pseudo-R2 were also calculated for the logistic models), and adjusted R2 for continuous traits.
Islands were not included in these models, as island plants were expected to respond differently due to the effects of insularity on animal species richness, precluding megafauna and extant mammal richness from being accurate proxies for consumer abundance. For stem spines, we always included a quadratic term for both megafauna and extant mammal herbivore body mass, as evidence suggests that medium-size herbivores (i.e., approximately 250 kg) are important selective drivers of this trait12. If a relationship with our herbivory indicators (either extant or extinct) was significant but not indicative of a selective effect by herbivores (for more defended plants), this relationship was discarded (along with related variables, such as diet); this happened only once, for leaf size, which increased with extant herbivore richness (Supplementary Table 8).
For all of the general linear regression models, assumptions of normality, homoscedasticity and lack of spatial autocorrelation in the residuals were checked using the Kolmogorov–Smirnov, Breusch–Pagan and Moran’s I tests, respectively. For the latter, ecoregions were considered neighbours when they were adjacent and non-neighbours otherwise. In some cases, heteroscedasticity was detected and, thus, the significance of the coefficients was tested using heteroskedasticity-consistent covariance matrix estimation. If one or more variables lost their significance, they were removed stepwise from the final model, beginning with the least significant, until all remaining variables had a significant effect. Overdispersion in the generalized linear models was also detected and dealt with using overdispersed binomial logit models, as implemented in the “dispmod” package for R, in which weights are iteratively calculated and used to maintain the residual deviance lower than the degrees of freedom. To confirm that the detected associations between megafauna indices and plant traits were robust, we also tested the coefficient significance using randomization of the plant species by ecoregion matrices (see Supplementary Methods for details).
To test the prediction that Neotropical ecoregions could be broadly classified into the three hypothesised antiherbiomes, we used hierarchical clustering on principal component axes of the ecoregion by trait matrix (five plant traits, standardized to zero mean and unit variance). We selected the number of clusters associated with the highest loss of inertia (within group variability) when progressively increasing the number of clusters, using the R package “FactoMineR”. This procedure allowed the recognition of large regions characterised by specific patterns of defence strategies (‘antiherbiomes’). We subsequently tested for axes score, megafauna and environmental differences among the resulting antiherbiomes to verify whether and how trait, climate and soil patterns matched those described for African ecosystems, and to understand the megafaunal differences among the antiherbiomes. For these comparisons, we used Kruskal-Wallis and post-hoc pairwise Dunn tests, using the Benjamini & Hochberg59 (1995) correction of P-values for multiple comparisons in both cases, and exclusively included continental ecoregions. For spines, we used the proportion of spinescent plants/species (rather than the number of “yes” and “no” used on previous analyses) in the principal component analysis. Because palms were missing from 20 ecoregions, we completed the values for these ecoregions using predicted model probabilities. To better understand these associations between traits and the environmental and megafauna variables, we also regressed the PCA axes against the same predictors used for traits.
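The Benjamini and Hochberg correction applied to the pairwise comparisons can be sketched as follows. This is a generic implementation of the standard step-up adjustment, not the code used in the study, and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P-values from four pairwise comparisons.
adjusted_p = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Each raw P-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase.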
We also developed a framework to identify forest ecoregions most likely to have experienced a biome shift after megafauna extinction, using antiherbiome, biome and megafauna distribution data. Ecoregions likely to have experienced a savanna-to-forest shift since the Pleistocene are those that: (1) are currently forest-dominated; (2) are classified in antiherbiomes analogous to African arid nutrient-rich or mesic nutrient-poor savannas; and (3) were megafauna- and, especially, megagrazer-rich during the Pleistocene (richness equal to or greater than the 0.75 quantile: 14 species for Mrich, and 3 for exclusively grazing species; MGrich). We validated the distribution of these areas with fossil evidence (22 sites) from the Last Glacial Maximum and mid-Holocene (see Supplementary Methods and Supplementary Table 9). For this, we also used information about the present dominant vegetation type in the fossil sites, extracted from the reference sources (see Supplementary Table 9), to segregate savanna-forest shifts from data coming from stable savanna patches within forest or long-term savanna regions. We also contrasted the predicted patterns with the present location of savanna patches within the Amazon Forest region from ref. 60.
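The three criteria can be expressed as a simple rule, sketched below. The field names and example ecoregions are hypothetical, and requiring both richness thresholds to be met at once is an assumption about how criterion (3) is applied.

```python
def likely_savanna_to_forest_shift(ecoregion, mrich_threshold=14,
                                   mgrich_threshold=3):
    """Apply criteria (1) to (3) to one ecoregion record.

    Thresholds are the 0.75 quantiles reported in the text. Assumption:
    both megafauna and megagrazer richness must reach their threshold.
    """
    return bool(
        ecoregion["forest_dominated"]
        and ecoregion["savanna_like_antiherbiome"]
        and ecoregion["mrich"] >= mrich_threshold
        and ecoregion["mgrich"] >= mgrich_threshold
    )


# Hypothetical ecoregion records, for illustration only.
candidate = {"forest_dominated": True, "savanna_like_antiherbiome": True,
             "mrich": 15.2, "mgrich": 3.0}
non_candidate = {"forest_dominated": True, "savanna_like_antiherbiome": True,
                 "mrich": 15.2, "mgrich": 1.5}
flag_a = likely_savanna_to_forest_shift(candidate)
flag_b = likely_savanna_to_forest_shift(non_candidate)
```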
All statistical analyses and data handling were carried out in the R (v.4.0.2) environment, using the previously mentioned packages, in addition to FSA, gridExtra, grid, lattice, lmtest, latticeExtra, olsrr, raster, rgdal, rgeos, sandwich, spatialreg, spdep and vegan, using code provided in ref. 47.
Reporting Summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Source: Ecology - nature.com