More stories

  • in

    Scenarios of land use and land cover change in the Colombian Amazon to evaluate alternative post-conflict pathways

    Study area

    In Colombia, the Amazon region represents 42.3% of the territory, with an estimated area of 483,164 km². Of this area, 14% is dominated by agricultural lands, secondary vegetation and fragmented forests. Currently, 86% of the area corresponds to natural areas in a good state of conservation, where forests are the dominant cover [6]. To the northwest, the region borders the Andean Cordillera, and to the north, Orinoquía. The political-administrative division includes the departments Amazonas, Caquetá, Guainía, Guaviare, Putumayo and Vaupés, and part of the departments Cauca, Meta, Nariño and Vichada. The human population is estimated at ~1.4 million, with a density of 2.5 inhabitants/km². Internal conflict and poverty give this region some of the most significant population dynamics in the country in terms of displacement [36]. The geographical location of the study area and the spatial pattern of the forest loss that occurred between 2002 and 2016 are shown in Fig. 1.

    Figure 1. Study area. Colombian Amazon and location of Amazonian tropical forests that were lost between 2002 and 2016. (Maps were generated using software ArcGIS 10.7.1, https://www.esri.com.)

    Land cover maps and variables for change analysis

    Thematic land cover maps used in this research were produced by the Colombian Amazon Land Cover Monitoring System (SIMCOBA) of the Amazon Institute for Scientific Research SINCHI (https://siatac.co/simcoba/). SIMCOBA has prepared land cover maps for the periods 2002, 2007, 2012, 2014, 2016 and 2018. Three of these land cover maps were used in this study (2002, 2016 and 2018), at a scale of 1:100,000 [33]. The maps were generated from the visual interpretation of a mosaic of Landsat 5 Thematic Mapper (TM) and Landsat 8 Operational Land Imager (OLI) images, using the PIAO technique (Photo Interprétation Assistée par Ordinateur).
    The classification categories of the land cover maps were based on the Corine land cover methodology adapted for Colombia [37].

    The SIMCOBA system calculates the annual rate of Amazon forest loss (hectares of forest lost per year) by comparing the cover maps of the last two periods and subtracting from the previous map those forests that are no longer present in the most current map (Fig. 3). This process considers only forest loss and permanent forests. New forests due to natural regeneration or restoration are omitted from the calculations [6].

    To facilitate the interpretation of changes and cover transitions, the classification categories of the maps were re-categorized into 7 types: “Amazon forests”, “floodplain forests”, “fragmented forests and secondary vegetation”, “grasslands and shrublands”, “water bodies and wetlands”, “pastures and crops” and “urban and artificialized cover”. The land cover maps were resampled at a resolution of 60 m × 60 m to facilitate the computational analysis of the explanatory model and the simulations of the scenarios, while keeping the detailed spatial resolution of the cover and explanatory variables [16].

    A geospatial database was created with a set of variables for the cover changes, in order to build an explanatory model for each transition. Driving factors of change are grouped into the following variables: (1) accessibility, (2) climate, (3) landscape features, (4) production practices and environmental degradation, (5) landscape management, (6) socioeconomy, and (7) soil characteristics. We considered 41 explanatory variables (see supplementary information Table S1).

    Accessibility variables such as roads and navigable rivers were obtained from the geodatabase at a scale of 1:100,000 of the Agustín Codazzi Geographical Institute of Colombia (IGAC). Bioclimatic temperature data were obtained from WorldClim v1.4 [38].
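The forest-loss rule described above (forest present on the earlier map, absent on the later one, regrowth ignored) can be illustrated with a minimal sketch. It is not the SIMCOBA implementation: the class code `FOREST`, the function name, and the list-of-lists grids are assumptions made for illustration, and the only value taken from the text is the 60 m × 60 m pixel size.

```python
# Minimal sketch of the "subtract forests no longer present" rule.
# FOREST is a hypothetical class code; real maps use their own legend.
FOREST = 1
PIXEL_AREA_HA = 60 * 60 / 10_000  # one 60 m x 60 m pixel covers 0.36 ha


def annual_forest_loss(earlier, later, years):
    """Mean annual forest loss (ha/year) between two class grids.

    A pixel counts as lost when it is forest on the earlier map and not
    forest on the later map. Pixels that become forest (regeneration or
    restoration) are ignored, as in the text.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    if len(earlier) != len(later):
        raise ValueError("grids must have the same shape")
    lost_pixels = 0
    for row_a, row_b in zip(earlier, later):
        if len(row_a) != len(row_b):
            raise ValueError("grids must have the same shape")
        for a, b in zip(row_a, row_b):
            if a == FOREST and b != FOREST:
                lost_pixels += 1
    return lost_pixels * PIXEL_AREA_HA / years


# Toy 2 x 3 maps: two pixels are lost, one regrows (and is not counted).
map_2002 = [[1, 1, 2], [1, 2, 2]]
map_2016 = [[1, 2, 2], [2, 1, 2]]
rate = annual_forest_loss(map_2002, map_2016, years=14)
print(f"{rate:.4f} ha/year")
```

Counting pixels is only a valid proxy for area because every pixel has the same size after resampling; with unequal pixel areas, each lost pixel would need its own area weight.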
    Cover variables (e.g., patch sizes of Amazon forests and distance to pastures and crops) were created with the software ArcGIS (v.10.7.1) [39] from the 2002 land cover map, to understand which drivers were most influential in the land-use change dynamics since 2002 that resulted in the distribution of land cover in 2016.

    Degradation variables, such as advances of the agricultural frontier, were obtained from the Territorial Environmental Information System of the Colombian Amazon (SIAT-AC) [40]; livestock density data came from the Colombian Agricultural Institute (ICA); fire density was derived from MODIS and VIIRS images (https://siatac.co/puntos-de-calor/); and the location of mining titles was obtained from the National Mining Agency.

    The information on the landscape features and socioeconomic variables was obtained from different sources: (1) the limits of the protected natural areas were provided by the National System of Protected Areas (SINAP) [41], (2) the Amazon Forest Reserve areas (Second Law of 1959) were obtained from the Ministry of Environment and Sustainable Development (MADS), (3) the location of the indigenous reservations was provided by the Ministry of the Interior, and (4) the limits of the areas of Indigenous Reservations and the legal status of the Amazonian region were obtained from the SINCHI cartographic database [40].

    Socioeconomic information was spatialized from data from the National Administrative Department of Statistics (DANE). Soil-type data were obtained from IGAC, and topographic and altitudinal variables were derived from a DEM at 100 m resolution from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER V003) sensor [42].
    All explanatory variables were resampled at a resolution of 60 m.

    Patterns of land cover changes and transitions

    The transformation patterns of a territory are mainly defined by human intentions and the activities that people plan to carry out after changing the land cover, as well as by the dynamics of vegetation regeneration [43]. In this study, these changes were obtained and analyzed with the Land Change Modeler (LCM) module of TerrSet [34], using the land cover maps for 2002 and 2016 as input information (Fig. 2).

    Figure 2. Land cover maps for 2002, 2016 and 2018, produced by the Colombian Amazon Land Cover Monitoring System (SIMCOBA) of the Amazonian Research Institute SINCHI. (Source: Open Data, SINCHI Institute, https://datos.siatac.co/pages/coberturas. Maps were generated using software ArcGIS 10.7.1.)

    To represent dynamics and changes in the vegetation during the study period, a total of 14 transitions of greater importance in terms of area were considered (transitions with an area

  • in

    Restoration of insect communities after land use change is shaped by plant diversity: a case study on carabid beetles (Carabidae)


  • in

    Eco-ISEA3H, a machine learning ready spatial database for ecometric and species distribution modeling

    Our objective in developing the Eco-ISEA3H database [37] was to compile a coordinated, global set of tabular data, characterizing environmental conditions and the geographic distributions of large mammalian species. The database was built on the ISEA3H DGGS, a multi-resolution system of global grids, each grid dividing the Earth’s surface into discrete, equal-area hexagonal cells. These cells constitute areal units of observation, uniformly resampling data provided in different coordinate reference systems, spatial resolutions, geographic data models, and file formats. We included data at six consecutive ISEA3H resolutions, in which cell centroid spacing ranges from 29 kilometers to approximately 450 kilometers.

    Eco-ISEA3H themes and variables were derived from 17 geospatial data sources, and represent 3,033 features to be used for ML-based predictive modeling. Source datasets were published in raster or vector format, data models built on fundamentally different representations of spatial phenomena. Raster datasets comprise regular arrays of pixels, each pixel holding a value, while vector datasets comprise point, line, and polygon features, each feature defined by one or more (x, y) coordinate pairs and attributed with one or more values. Our task was to integrate these disparate source datasets, resampling and summarizing the values of raster pixels and vector features via the discrete, equal-area cells of the ISEA3H global grid system. The hexagonal cells on which the Eco-ISEA3H database [37] is built thus serve as unifying observational units for SDM and ecometric analysis and modeling.

    From the statistical and ML perspective, each areal observational unit is characterized by (1) a set of environmental variables, representing climatic conditions, soil and near-surface lithology, land cover, and physical geography; and (2) a set of occurrence variables, representing the present and estimated natural distributions of large mammalian species.
    Predictive modeling tasks for statistical and ML modeling can be formulated in two directions: predicting species’ occurrences as a function of climatic and other environmental conditions (as in SDM studies), or predicting climatic and other environmental conditions as a function of species’ occurrences and functional traits (as in ecometric studies).

    Spatial units of observation

    To study continuous spatial phenomena over a region of interest, it is often necessary to divide the region into a number of discrete, areal observational units, which may be used in statistical summaries and/or modeling. Machine learning methods for ecometric and species distribution modeling require discrete observational units, each characterized by two sets of variables, one describing environmental conditions, the other species’ geographic distributions. A major question in data representation concerns the form of these units; defining discrete spatial units of observation constitutes a well-known problem in geography, termed the modifiable areal unit problem (MAUP) [38]. As we change the size of proposed observational units, or change the boundaries between units while holding unit areas constant, measures of interest within these units – and derived summary statistics and model parameters – may differ; these are termed the “scale” and “zone” effects, respectively [38].

    Our objective in utilizing the ISEA3H DGGS [34] was to implement a robust spatial division of the Earth’s surface. The grid cells of the DGGS discretize the Earth’s sphere, forming, at each DGGS resolution, a global set of areal observational units with which to sample and summarize source datasets. To be optimally effective in the observation, simulation, and visualization of spatial phenomena, such a grid must meet certain structural criteria.
    We propose, modifying the Goodchild Criteria [39], that the DGGS grid must contain (1) contiguous, (2) equivalent observational units, which (3) minimize intra-unit variability, (4) have uniform topology with neighboring units, and (5) are visually effective, facilitating interpretation and communication. Each criterion will be discussed in detail; further, we will argue that the ISEA3H DGGS selected for this study satisfies these criteria.

    Contiguity & congruency

    We suggest that a regular tiling maximally satisfies the criteria of (1) contiguity and (2) equivalence. A tiling is simply a set of shapes which cover a plane without gaps or overlaps [40]. A regular tiling is one of a class of tilings in which the tiles – our observational units – are highly equal; such tilings are monohedral, and composed of congruent, regular (equiangular and equilateral) polygons. Thus, regular tilings are also highly symmetrical, being vertex-, edge-, tile-, and flag-transitive. Three regular polygons may be used to create a regular tiling: the equilateral triangle, the square, and the regular hexagon [40].

    With this suggestion, we follow common convention; in ecology, grids of square (or rectangular) cells are most often utilized, motivated in part by the use of raster datasets [41], made of rectilinear rows and columns of pixels. However, it should be noted that while the square cells of these grids are equal in the coordinate reference system in which they are defined, such cells are rarely congruent, or indeed even square, on the Earth’s surface.
    The properties of the ISEA projection selected for this DGGS – area preservation, and relatively low angular distortion – serve to retain considerable congruency when inversely projecting grid cells to the spherical surface of the Earth.

    Compactness

    To accurately represent the spatially continuous phenomena of the Earth system, the grid cells of a DGGS – the areal observational units used in summarizing, modeling, and visualizing – must effectively discretize these phenomena. Thus, the DGGS must be structured such that (3) intra-unit variability is minimized, and inter-unit variability is maximized. In this way, patterns of variation among units more accurately represent patterns of variation inherent in the phenomena.

    Intra-unit variability may be minimized, in expectation, by compact observational units. Tobler’s oft-cited first law of geography serves as explanation: “everything is related to everything else, but near things are more related than distant things” [42]. Thus, compact units, in which all portions of the interior are nearer each other, are expected to contain less interior variability than elongated units, in which portions of the interior may be more distant. Given these properties, compact units are optimal in the context of DGGS development, and elongated units in the context of efficient ecological sampling.

    Regular hexagons are the most compact of the three polygons – the equilateral triangle, square, and regular hexagon – admitting regular tilings. This compactness may be expressed in several related and complementary ways. First, of any equal-area tiling, regular hexagons have the minimum possible ratio of perimeter to area [43]. In minimizing perimeter length per unit area, regular hexagons are thus the most circle-like of the polygons admitting equal-area tilings.
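The perimeter-to-area comparison above can be checked numerically. The sketch below is an illustration, not part of the Eco-ISEA3H workflows; the function names are assumptions. It relies on the standard identity that a regular polygon with n sides and area A has perimeter 2·sqrt(n·A·tan(π/n)), and it includes the circle as the lower bound.

```python
import math


def regular_polygon_perimeter(n_sides, area=1.0):
    """Perimeter of a regular n-gon with the given area.

    Follows from area = (n * s**2) / (4 * tan(pi / n)) for side length s.
    """
    if n_sides < 3:
        raise ValueError("a polygon needs at least 3 sides")
    return 2.0 * math.sqrt(n_sides * area * math.tan(math.pi / n_sides))


def circle_perimeter(area=1.0):
    """Circumference of a circle with the given area."""
    return 2.0 * math.sqrt(math.pi * area)


# Perimeter per unit area for the three polygons admitting regular tilings.
shapes = {"triangle": 3, "square": 4, "hexagon": 6}
perimeters = {name: regular_polygon_perimeter(n) for name, n in shapes.items()}
perimeters["circle"] = circle_perimeter()

for name, value in perimeters.items():
    print(f"{name:8s} {value:.3f}")
# triangle 4.559
# square   4.000
# hexagon  3.722
# circle   3.545
```

At equal area, the hexagon's perimeter is about 7% shorter than the square's and only about 5% longer than the circle's, which is one way to quantify "most circle-like."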
    Relatedly, regular hexagonal packing is the highest-density arrangement of equal-area circles on a plane [44].

    Finally, a regular hexagonal lattice optimally quantizes a plane; of the polygons admitting regular tilings, regular hexagons minimize the mean squared distance of any point to the nearest polygon centroid [45]. This distance, or “dimensionless second moment,” quantifies the more qualitative notion of interior nearness discussed in relation to Tobler’s Law.

    Topology

    In addition to maximally satisfying the (3) compactness criterion, regular hexagons have a topological advantage over equilateral triangles and squares. Of these three regular polygons, hexagons have the simplest relationship with neighbors in a tiling or grid, each (4) uniformly sharing an edge with the six adjacent hexagons forming its first-order neighborhood. Triangles and squares, in contrast, share only a single vertex with nine or four neighbors, respectively, and an edge with three or four neighbors, complicating the definition of neighborhood in these grids.

    It follows that hexagonal topology has greater angular resolution than edge-based triangular or square topologies; movement may be simulated between cells in six directions, rather than in three or four, respectively. These properties – neighborhood simplicity and angular resolution – were confirmed by Golay [46], in the context of pattern transformation operations on two-dimensional arrays. Further, these properties likely account for the widespread use of hexagonal grids in strategy board games, since these grids were introduced in the early 1960s [47].

    Differing grid topologies affect the results of ecological models simulating dispersal.
    White and Kiester [48], for example, found the topology of the network of communities in a neutral community ecology model – in which simulated communities had hexagonal neighborhoods, or von Neumann, Moore, or Margolus neighborhoods – affected modeled species abundances and diversities, but in complex ways, which differed given different model parameter values. (Note that the four neighbors with which a square cell shares an edge are termed its rook, or von Neumann neighborhood, and these plus the four neighbors with which it shares a single vertex its queen, or Moore neighborhood.)

    Visualization

    Finally, in addition to these gains in representational accuracy, (5) hexagonal tilings are more visually effective than square tilings. Whether used in cartography or other two-dimensional data visualization, tilings inevitably create visual lines, artifacts of the lattice of shared edges between tiles [49]. Given our “sense of gravitational balance,” Carr et al. [49] argue the horizontal and vertical lines of square tilings strongly distract the human eye, obscuring data-driven patterns in a dataset so visualized. The non-orthogonal lines of hexagonal tilings, however, feature less prominently, and thus distract less from patterns of interest [49].

    Note that this is not an issue of aesthetics only: maps are often essential tools in scientific reasoning and communication, and effective visualization is important. Indeed, Carr et al. [49] suggest this visual advantage makes a stronger case for hexagonal tilings than the representational advantages discussed previously.

    DGGS sampling workflows

    The set of scripted workflows developed to incorporate spatial datasets into the Eco-ISEA3H database [37] utilizes published spatial libraries and packages for Python and R, and includes several validation steps, intended to verify the integrity of source datasets and the fidelity of the transfer to the DGGS. Workflows developed for raster datasets are presented in Fig. 1, and workflows for vector datasets in Fig. 2.

    Fig. 1. Workflow developed to incorporate raster datasets into the ISEA3H DGGS.

    Fig. 2. Workflow developed to incorporate vector datasets into the ISEA3H DGGS.

    To begin, one general principle guides each workflow: each source dataset is processed in its native coordinate reference system. In all cases, a representation of the DGGS is developed in the coordinate reference system of the source dataset, and used in summarizing that dataset. The guiding premise here is that the spatial dataset is as the authors intended it in the coordinate reference system in which it is published and distributed.

    This is especially relevant for vector polygon datasets. Consider, for example, certain species’ range polygons published by the IUCN Red List [50]; these polygons are defined only roughly, having relatively few, widely spaced vertices, connected by arcs many hundreds of kilometers in length. These arcs are “straight” in the plate carrée projection with which the dataset’s WGS84 latitude/longitude coordinates are visualized by default. If vertex coordinates were projected into another coordinate reference system, the arcs would be similarly “straight” in this new system, and thus potentially trace very different paths across the Earth’s surface. Absent information to the contrary, we assume the arcs are as intended in the reference system in which the data are distributed.

    The spatial structure of raster datasets depends similarly on each dataset’s coordinate reference system; rasters are made of rows and columns of pixels, rectilinear and orthogonal only in the raster’s native coordinate reference system. We assume raster pixels are “atomic” units, each indivisible and representative of the area it natively covers.
Thus, we query the DGGS at each pixel’s centroid, and assign the pixel wholly to the coincident DGGS cell.

Raster dataset processing

If necessary, source raster datasets were first converted to the GeoTIFF file format, so that the files were readable in the open-source GIS software used later in the processing workflow. GeoTIFF files are simply Tag Image File Format (TIFF) image files with embedded georeferencing information, describing the dataset’s spatial extent and coordinate reference system. Hierarchical Data Format Release 4 (HDF4) files were converted to GeoTIFF format using the Geospatial Data Abstraction Library (GDAL) translate utility51.

Next, raster tiles containing ISEA3H hexagon identification (HID) indexing numbers were generated; these integer HIDs uniquely identify each cell at a given ISEA3H resolution. A set of HID raster tiles was required for each source raster dataset, for each ISEA3H resolution, because (1) GeoTIFF rasters are able to hold only a single value at each pixel; and (2) HIDs sequentially number cells at a given ISEA3H resolution, from 1 to the number of cells present at that resolution. Thus, HIDs are not unique between resolutions; HID 84, for example, identifies a cell at each ISEA3H resolution 2 and higher.

The HID raster tiles generated for a source raster dataset matched that dataset’s grid resolution, extent, and coordinate reference system precisely; thus, there was a one-to-one correspondence between the pixels of the HID raster tiles and the source raster dataset tiles. For each tile, pixel centroid coordinates were passed to the dggridR package52 for R, which returned the ISEA3H cell identification number for that location. In this way, the pixels of the source raster were treated as indivisible units, assigned wholly to a particular HID on the basis of each pixel’s centroid. 
HID rasters were written in GeoTIFF format using the raster package53 for R.

In equal-area projected coordinate reference systems, simple counts of the number of raster pixels assigned to each HID were sufficient to determine each ISEA3H cell’s total area. In all other cases – for example, for raster datasets using the World Geodetic System 1984 (WGS84) coordinate reference system – raster tiles containing pixel areas were generated. These areas were calculated by passing each pixel’s corner coordinates to the GeographicLib library54 for Python.

Finally, source raster dataset tiles, HID raster tiles, and area raster tiles (for source rasters using non-authalic coordinate reference systems) were superimposed to generate summary tabular files, describing the features of the source raster dataset by ISEA3H cell. The specifics of this process, which utilized functions of the raster package53 for R, depended on whether the source raster contained discrete, categorical values, or continuous, real-numbered values.

Discrete themes

For each source raster dataset containing discrete pixel values, one or more of the following summary statistics were calculated. While the centroid attribute requires a simple point sample, the fraction and mode attributes are area-integrated, and involve a multiple-step sampling process. For rasters using an authalic coordinate reference system, the raster package’s crosstab function53 was used to generate a contingency table for each tile; applied to source raster and HID raster tiles, the function tallied the number of pixels of each class coincident with each HID, for each tile. These tile-specific tables were then summed, to obtain total counts of pixels of each class within each HID.

For rasters using a non-authalic coordinate reference system, area raster tiles were required as well. For each tile, a vector of classes present in the source raster was assembled. 
For each of these classes in turn, a mask raster tile was generated, retaining pixels belonging to the class, and screening pixels belonging to all other classes. This mask was applied to the area raster tile, and retained pixels were summed within each HID using the raster package’s zonal function53. Thus, a contingency table was compiled for each raster tile, containing the area of each class within each HID. Finally, these tile-specific tables were summed, to obtain the total area of each class within each HID.

    Centroid. The centroid attribute records the categorical value occurring at each ISEA3H cell’s centroid. Where the source raster dataset contains a null value at a centroid, the cell is assigned a flag signifying no value is available.

    Fraction. The fraction attributes record the proportion of each ISEA3H cell’s area covered by each categorical value. For example, the Köppen-Geiger climate classification system, as implemented by Beck et al.55, includes 30 classes, listed in Table 4. Thus, each ISEA3H cell has an associated set of 30 fraction attributes for this dataset, recording the proportions of the cell’s area covered by the 30 categorical values, from tropical rainforest (Af) to polar tundra (ET).

    Mode. The mode attribute records the categorical value covering the greatest proportion of each ISEA3H cell’s area. For example, if an ISEA3H cell had a fraction value of 0.4 for some hypothetical categorical value A, 0.3 for B, and 0.3 for C, it would be assigned a mode value of A. A mode attribute is specified for cells in which the sum of the fraction attributes is greater than or equal to 0.2; where fraction attributes total less than 0.2, a flag signifying no value is assigned.
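The fraction and mode attributes described above can be sketched in a few lines of plain Python. This is an illustration only, not the authors' R workflow: the table layout (`{(hid, cls): area}`), the function names, and the `NO_VALUE` flag are assumptions made for the example, and ties between equally large classes are broken arbitrarily (by first occurrence), which the source does not specify.

```python
from collections import defaultdict

NO_VALUE = None  # hypothetical stand-in for the "no value" flag


def sum_tile_tables(tile_tables):
    """Sum per-tile {(hid, cls): area} tables into a single table."""
    total = defaultdict(float)
    for table in tile_tables:
        for key, area in table.items():
            total[key] += area
    return dict(total)


def fractions_and_mode(class_areas, cell_areas, min_coverage=0.2):
    """Return {hid: (fractions, mode)} from summed class areas.

    fractions maps each class to its share of the cell's total area;
    mode is the class with the largest share, or NO_VALUE when the
    fractions sum to less than min_coverage.
    """
    per_cell = defaultdict(dict)
    for (hid, cls), area in class_areas.items():
        per_cell[hid][cls] = area / cell_areas[hid]
    result = {}
    for hid, fracs in per_cell.items():
        covered = sum(fracs.values())
        mode = max(fracs, key=fracs.get) if covered >= min_coverage else NO_VALUE
        result[hid] = (fracs, mode)
    return result


# Two made-up tiles that both touch HID 84; areas are in arbitrary units.
tiles = [
    {(84, "A"): 30.0, (84, "B"): 30.0},
    {(84, "A"): 10.0, (84, "C"): 30.0, (85, "A"): 10.0},
]
cell_areas = {84: 100.0, 85: 100.0}
summary = fractions_and_mode(sum_tile_tables(tiles), cell_areas)
```

With these inputs, HID 84 reproduces the hypothetical case in the text (fractions of 0.4, 0.3, and 0.3, mode A), while HID 85 is covered over only 10% of its area and so receives the no-value flag.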

    Continuous variables

    For each source raster dataset containing continuous pixel values, one or more of the following summary statistics were calculated. Again, the centroid attribute requires only a simple point sample, while the mean attribute is area-integrated, requiring area raster tiles for source rasters using a non-authalic coordinate reference system.

    Centroid. The centroid attribute records the continuous value occurring at each ISEA3H cell’s centroid. Where the source raster dataset contains a null value at a centroid, the cell is assigned a flag signifying no value is available.

    Mean. The mean attribute records the area-weighted arithmetic mean of the continuous values of raster pixels within each ISEA3H cell. For raster datasets in authalic coordinate reference systems, the area-weighted mean is equivalent to the simple mean of the values of raster pixels within each cell; however, in all other cases, pixel values are weighted by pixel areas per the equation below, in which $w_i$ and $x_i$ indicate the area and value, respectively, of each pixel $i$ within an ISEA3H cell containing $n$ pixels.

    $$\overline{x}=\frac{\sum_{i=1}^{n}w_{i}x_{i}}{\sum_{i=1}^{n}w_{i}}$$

    For each tile, source raster values and area values were multiplied, pixel by pixel, using the raster package’s * arithmetic operator53. The resulting product raster tile, as well as the area raster tile, were then summed within each HID using the raster package’s zonal function53. Finally, these tile-specific tables were summed, to obtain both the numerator (summed product values) and denominator (summed area values) for the above equation, for each HID.

    Vector dataset processing

    Source vector datasets incorporated into the Eco-ISEA3H database37 contain polygon features, discrete areas assigned a categorical value. A dataset may (1) contain polygons of several different classes; for example, the vector shapefile published by Olson et al.56 contains ecoregion polygons, each assigned to one of several biogeographic realms. Alternatively, a dataset may (2) represent a single class, with polygons indicating class presence; for example, the shapefiles published by the IUCN Red List50 each represent a species’ geographic range, with polygons indicating regions in which the species is present. In both cases, the summary statistics discussed in reference to raster datasets containing discrete values may be calculated.

    Prior to inclusion in the Eco-ISEA3H database37, source vector datasets were preprocessed. To simplify the geographic representation of the class(es) of interest – that is, to remove unnecessary polygon boundaries – dataset polygons were dissolved, either on the class attribute in case (1), or globally in case (2), using the QGIS open-source desktop GIS application. The geodesic areas of dissolved polygons were then calculated using the GeographicLib library54. 
Finally, the geometries of dissolved polygons were checked for conformance with the OGC Simple Feature Access standard57 using the Shapely library58 for Python, ensuring these features served as valid input in the processing workflow to follow.The intersection of source dataset polygons and ISEA3H cell polygons is central to the vector processing workflow. Source polygons result from the preliminary simplification and verification steps just discussed; cell polygons result from polygonizing a set of HID raster tiles for the ISEA3H resolution of interest. The polygonizing procedure utilized the open-source GDAL command-line tools polygonize and ogrmerge51, as well as the GeographicLib54 and Shapely58 libraries. Polygonizing HID raster tiles of the appropriate coordinate reference system (specifically, the system matching that of the source polygon dataset) ensured HID polygon boundaries displayed both proper geodesic curvature and the shape distortion induced by the ISEA map projection.Intersection is a set-theoretic operation, returning polygons representing each coincident class/HID combination. The operation was implemented via the Shapely library58, and the geodesic areas of intersected polygons were calculated via the GeographicLib library54. Note that the scripted intersection tools developed for the Eco-ISEA3H database37 allow limiting the ISEA3H cells included in a single tool run, to break the processing of large datasets into manageable pieces. Runs may be limited to a user-specified range of HIDs. Additionally, if cells at the next coarser or finer ISEA3H resolution have been intersected with the source dataset, cells retained by the operation may be used as a spatial index; a list of coincident HIDs at the ISEA3H resolution of interest may be generated, and used to limit tool runs.An output shapefile is written, containing intersected polygons attributed with the geodesic area, the HID, and in case (1), the source class. 
Next, an additional verification of the geometries of these intersected polygons is performed. Each intersected polygon is superimposed over the original ISEA3H cell polygon having the same HID. If intersected polygons have too few vertices to be valid, or are not contained by the original cell polygon from which each was derived, these polygons are flagged for review and revision. This step was implemented to catch geometry errors observed early in the development of the Eco-ISEA3H intersection tools.

Finally, the geodesic areas of intersected polygons are totaled, and the total area of each class within each HID is calculated. Dividing by the geodesic areas of the original ISEA3H cell polygons, these class totals are expressed as fractions of each cell’s total area. In two final verification steps, (1) the total intersected area of each class, across all HIDs, is compared to the area of the same class in the source dataset; and (2) class fraction values are confirmed to be less than or equal to unity within each HID. Deviations are flagged for review and revision.

Data sources & themes

The Eco-ISEA3H database37 incorporates 17 source datasets, characterizing the Earth’s climate, geology, land cover, and physical geography, as well as human population density and the geographic ranges of nearly 900 large mammalian species. Data sources are listed in Table 1. We first present a brief overview of these sources, and describe sources and themes in greater detail in the following sections.

Table 1. Source datasets and themes included in the Eco-ISEA3H database37. Each dataset is described by full and abbreviated name, source, spatial resolution (for datasets published/distributed at more than one resolution), version, and scenario. 
Each theme is described by full and abbreviated name and type (whether it contains discrete, categorical values or continuous, real-valued variables).

Climate is characterized primarily by temperature- and precipitation-based averages and extremes, summarized over the past 50 to 70 years, and forecasted for 40 to 60 years in the future under the RCP 8.5 climate change scenario59; data sources include WorldClim30,31, ENVIREM60, and the ETCCDI extremes indices derived by Sillmann et al.61,62 from ERA-4063 and CCSM464. Additionally, present climate is classified via the Köppen-Geiger climate classification system, from GLOH2O55. Geological data include soil types, from the Digital Soil Map of the World (DSMW)65; near-surface rock types, from the Global Lithological Map (GLiM)66; and sedimentary basin types67. Human geography is quantified by human population density, from the Gridded Population of the World (GPW)68. Land cover is described by the International Geosphere-Biosphere Programme (IGBP) cover classification scheme, from MCD12Q169; and by percent tree, non-tree, and non-vegetated cover, from MOD44B70. The Earth’s physical geography is characterized by continental and island landmasses, from Natural Earth; lakes and wetlands, from the Global Lakes and Wetlands Database (GLWD)71; biogeographic realms56; and terrestrial topography and ocean bathymetry, from ENVIREM60 and SRTM30_PLUS72. Finally, distributional data include the present and estimated natural ranges of large mammalian species, from the IUCN Red List50 and the Phylogenetic Atlas of Mammal Macroecology (PHYLACINE)73,74.

Climate

    ENVIREM. The ENVIREM (ENVIronmental Rasters for Ecological Modeling) dataset60 contains 16 climatic variables derived from WorldClim v1.4 monthly temperature and precipitation30, and extraterrestrial radiation. These are intended to complement the WorldClim v1.4 bioclimatic variables30, capturing additional environmental features directly relevant to floral and faunal physiology and ecology60. Source rasters at 30 arc-second resolution were summarized by area-weighted mean at ISEA3H resolutions 8 and 9. Variable codes, descriptions, and units are listed in Table 2. Title and Bemmels60, and references therein, provide full definitions and calculation methods for these variables.

    Table 2. Codes, descriptions, and units for the 16 ENVIREM climatic variables, from Title and Bemmels60.

    ETCCDI. A comprehensive set of 27 climate extremes indices was defined by the Expert Team on Climate Change Detection and Indices (ETCCDI); these generally capture “moderate” extremes, having recurrence intervals of a year or shorter, and are based on observed/simulated daily temperature and precipitation61,62. Sillmann et al.61,62 derive these indices from results of a number of global climate models and atmospheric reanalyses, several of which were incorporated in the Eco-ISEA3H database37. Given the relatively low-resolution grids used in modeling and reanalysis, these source rasters were interpolated to ISEA3H cell centroids by inverse (geodesic) distance weighting (IDW). Variable codes, descriptions, and units are listed in Table 3. Sillmann et al.61 provide full definitions and calculation methods for these indices.

    Table 3. Codes, descriptions, and units for the 27 ETCCDI climate extremes indices, from Sillmann et al.61,62.

    The Eco-ISEA3H database37 includes ETCCDI variables based on results of the ERA-40 reanalysis63, produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). The reanalysis combines past meteorological observations with a weather forecasting model, producing a global representation of the state of the atmosphere for each reanalysis time step, usually a six-hour interval63. These were averaged for the period 1958 to 2001, the 44 full years for which the ERA-40 reanalysis was conducted, and were interpolated to ISEA3H resolutions 5 to 9.

    Additionally, the database includes ETCCDI variables based on results of the Community Climate System Model v4 (CCSM4), a global climate model developed for CMIP564. These were averaged for the period 1950 to 2000, to match the approximate period covered by WorldClim v1.4, and for the period 2061 to 2080, to match the final interval for which CCSM4 model results were downscaled/debiased using WorldClim v1.430. Variables were interpolated to ISEA3H resolution 9.

    ETCCDI variables for this latter period represent conditions under Representative Concentration Pathway (RCP) 8.5, the RCP resulting in the highest radiative forcing (8.5 W/m²) by 210059. This scenario was selected such that future conditions maximally different from the present might be considered; in RCP 8.5, rapid population growth, and relatively slow growth in per capita income and technological development, lead to high energy demand without associated climate mitigation policies, resulting in greenhouse gas emissions and atmospheric concentrations increasing significantly in the coming decades59.
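The inverse distance weighting used for the ETCCDI grids can be illustrated with a small, self-contained sketch. Several details here are assumptions rather than the authors' method: the text does not state the distance exponent or how many grid points contribute, so the example uses a power of 2 and all supplied samples, and it approximates geodesic distance with the spherical haversine formula instead of an ellipsoidal calculation. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two lat/lon points (degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def idw(target, samples, power=2.0):
    """Interpolate a value at target=(lat, lon) from (lat, lon, value) samples.

    Weights are distance ** -power (power=2 is an assumed default); a sample
    that coincides with the target is returned directly.
    """
    num = den = 0.0
    for lat, lon, value in samples:
        d = great_circle_km(target[0], target[1], lat, lon)
        if d == 0.0:
            return value
        w = d ** -power
        num += w * value
        den += w
    return num / den


# A cell centroid midway between two grid points receives their average.
midpoint_value = idw((0.0, 0.0), [(0.0, 1.0, 10.0), (0.0, -1.0, 20.0)])
```

In practice the samples would be the centers of the coarse model grid cells nearest each ISEA3H cell centroid; restricting the neighbor set is a design choice the sketch leaves open.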

    Köppen-Geiger Climate Classification. As implemented by Beck et al.55, the Köppen-Geiger system classifies the Earth’s terrestrial climates into five primary classes, and further into 30 subclasses, based on a set of threshold criteria referencing monthly mean temperature and precipitation. These climate classes are ecologically significant, as regions within each class support floral communities sharing common characteristics. Beck et al.55 utilize four climatic datasets, including WorldClim v1.x and v2.x, adjusted to the period 1980 to 2016, to define the present-day classes incorporated in the Eco-ISEA3H database37. The source raster at 30 arc-second resolution was summarized by fraction and mode at ISEA3H resolution 9. Variable codes and descriptions are listed in Table 4.

    Table 4. Codes and descriptions for the 30 Köppen-Geiger climate classes, from Beck et al.55.

    WorldClim v1.4. The first-generation WorldClim dataset30 contains four monthly themes, each with 12 variables, characterizing monthly temperature and precipitation; additionally, it contains 19 bioclimatic variables, derived from the monthly variables, capturing biologically relevant seasonal and annual features of the climate system. These bioclimatic variables, first developed for the BIOCLIM species distribution modeling (SDM) package75, are used extensively in SDM studies; a recent synthesis found most were included in more than 1,000 published MaxEnt SDMs (of 2,040 reviewed)76.

    WorldClim monthly temperature and precipitation rasters are interpolated from weather station observations averaged for the approximate period 1950 to 2000. The interpolation was done using thin plate smoothing splines, with latitude, longitude, and elevation as predictor variables30. These rasters characterize present-day climate, and further served as an observational baseline with which the predictions of CMIP5 global climate models were downscaled and bias-corrected.

    The 19 bioclimatic variables, for both present-day and future conditions (the latter averaged for the period 2061 to 2080, from the CCSM4 RCP 8.5 simulation), were incorporated into the Eco-ISEA3H database37; source rasters at 30 arc-second resolution were summarized by area-weighted mean at ISEA3H resolution 9. Variable codes, descriptions, and units are listed in Table 5. O’Donnell and Ignizio77 provide full definitions and calculation methods for these variables.

    Table 5. Codes, descriptions, and units for the 19 WorldClim bioclimatic variables, from v1.430 and v2.031.

    WorldClim v2.0. The second-generation WorldClim dataset31 contains seven monthly themes, each with 12 variables, characterizing monthly temperature, precipitation, solar radiation, wind speed, and vapor pressure; additionally, it contains the standard set of 19 bioclimatic variables, derived from monthly temperature and precipitation.

    As in the first-generation dataset, monthly rasters were interpolated from weather station observations, averaged here for the approximate period 1970 to 200031. Again, thin plate smoothing splines were used in the interpolation, but with additional covariates included for one or more interpolated features: distance to coast, computed extraterrestrial radiation, and three satellite-derived observations – cloud cover, and maximum and minimum land surface temperature, from the Moderate Resolution Imaging Spectroradiometer (MODIS) instrument.

    The 12 source rasters for each of the seven monthly themes, at 30 arc-second resolution, were summarized by centroid at ISEA3H resolutions 5 to 10. Additionally, the 19 source bioclimatic rasters, at 30 arc-second resolution, were summarized by centroid at ISEA3H resolutions 5 to 10, and by area-weighted mean at ISEA3H resolutions 6 to 9. Codes, descriptions, and units for the bioclimatic variables are listed in Table 5.

    Geology

    DSMW. The Digital Soil Map of the World (DSMW)65 describes the geographic distribution and physical and chemical properties of the world’s soils. The DSMW was digitized from the FAO-UNESCO Soil Map of the World, printed at 1:5,000,000 scale. Each digitized mapping unit is assigned a number of soil attributes; here we classify units via the DOMSOI attribute, the dominant soil or land unit code. The DSMW includes 117 soils in 26 major soil groupings, as well as six other land units, for a total of 123 DOMSOI classes. The source vector dataset was summarized by fraction and mode at ISEA3H resolutions 5, 6, and 9. Variable codes and descriptions are listed in Table 6.

    Table 6. Codes and descriptions for the 123 DSMW soil and land units, from the FAO65.

    GLiM. The Global Lithological Map (GLiM)66 represents the rock and unconsolidated sediments at or near the Earth’s terrestrial surface; this geological material is a source of geochemical flux to the Earth’s soils, biosphere, and hydrosphere. Hartmann and Moosdorf66 compiled the map and accompanying database from 92 regional geological maps and 318 literature sources. Rock was classified into 16 first-level lithological classes; 12 second-level and 14 third-level subclasses further describe specific mineralogical and physical properties.

    The source vector dataset was summarized by centroid at ISEA3H resolution 9. Variable codes and descriptions are listed in Table 7. The attribute assigned each ISEA3H cell takes the form xxyyzz; underscore characters (_) in the yy and/or zz slots indicate the second- and/or third-level subclasses were undefined.

    Table 7. Codes and descriptions for the 16 first-level, 12 second-level, and 14 third-level GLiM lithological classes, from Hartmann and Moosdorf66.
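As a small illustration of the xxyyzz attribute layout, the sketch below splits a six-character code into its three two-character slots and maps underscore-filled slots to `None`. The function name is hypothetical, and the example strings are illustrative only; they have not been checked against the class codes in Table 7.

```python
def parse_glim_code(code):
    """Split an 'xxyyzz' GLiM attribute into (first, second, third) levels.

    A slot filled with underscores ('__') means that level was undefined
    and is returned as None.
    """
    if len(code) != 6:
        raise ValueError(f"expected a 6-character code, got {code!r}")
    slots = [code[i:i + 2] for i in (0, 2, 4)]
    first, second, third = (None if s == "__" else s for s in slots)
    return first, second, third


# Illustrative strings only: one with a second-level subclass, one without.
with_second = parse_glim_code("ssmx__")
first_only = parse_glim_code("su____")
```

Returning a tuple with `None` placeholders keeps the three levels separately queryable, for example when grouping cells by first-level class alone.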

    Sedimentary Basins. Sedimentary basins are areas of subsidence in the Earth’s crust, in which sediments eroded from uplands are deposited and potentially preserved for a million or more years67, thus entering the planet’s long-term geological record. Nyberg and Howell67 delineate active sedimentary basins, covering both the Earth’s terrestrial surface and marine areas over continental crust. The authors operationally defined basins as low-relief areas containing Quaternary Period sediments, and further classified the basins by tectonic setting, identifying backarc, forearc, foreland, extensional, intracratonic, passive margin, and strike-slip basins on the basis of published literature and geological maps67.

    Terrestrial basins were incorporated in the Eco-ISEA3H database37. Note that no terrestrial backarc basins were delineated. The source vector dataset was summarized by fraction and mode at ISEA3H resolution 9.

    Human geography

    GPW. Human population density is one of several measures of human presence and activity which together define the human “footprint,” associated with profound, adverse effects on natural systems78. Given this pervasive impact, data characterizing degree of human influence are used as predictors in some ecological models, including SDMs28. The Gridded Population of the World (GPW)68 density dataset represents the global distribution of human population density, developed using census records, population registers, and the administrative boundaries of approximately 13.5 million national and subnational units. Density, measured by population count per square kilometer, was estimated every five years, from 2000 to 2020, inclusive. The source raster dataset for each year, at 30 arc-second resolution, was summarized by area-weighted mean at ISEA3H resolutions 6 to 9.
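The area-weighted mean applied to this and the other continuous rasters follows the numerator/denominator accumulation described under "Continuous variables". The sketch below mirrors that tile-by-tile logic in plain Python; the input layout (parallel lists of HIDs, values, and pixel areas per tile) and the choice to skip pixels whose value is `None` are assumptions made for the example, not details given in the text.

```python
from collections import defaultdict


def area_weighted_means(tiles):
    """Compute the area-weighted mean per HID across raster tiles.

    tiles is an iterable of (hids, values, areas) triples, one per tile,
    where the three sequences are aligned pixel by pixel. Pixels whose
    value is None are skipped (an assumed no-data convention).
    """
    num = defaultdict(float)  # summed area * value per HID
    den = defaultdict(float)  # summed area per HID
    for hids, values, areas in tiles:
        for hid, x, w in zip(hids, values, areas):
            if x is None:
                continue
            num[hid] += w * x
            den[hid] += w
    return {hid: num[hid] / den[hid] for hid in den}


# Two made-up tiles; HID 1 spans both, and one pixel of HID 2 has no data.
tiles = [
    ([1, 1, 2], [10.0, 20.0, 5.0], [1.0, 3.0, 2.0]),
    ([1, 2], [30.0, None], [1.0, 2.0]),
]
means = area_weighted_means(tiles)
```

Keeping the numerator and denominator as separate running sums is what allows tiles to be processed independently and combined afterwards, as in the workflow described earlier.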

    Land cover

    MCD12Q1. The Moderate Resolution Imaging Spectroradiometer (MODIS) land cover type (MCD12Q1) dataset69 describes land cover globally, via six different classification schemes. The Eco-ISEA3H database37 includes land cover classified via the International Geosphere-Biosphere Programme (IGBP) scheme, initially developed for the DISCover dataset79; the IGBP scheme includes 16 land cover classes, 13 natural and three anthropogenically modified. The MCD12Q1 dataset is derived from reflectance data collected by the MODIS instruments aboard the Terra and Aqua satellites; the two instruments observe the entirety of the Earth’s surface every one to two days, recording reflectance in 36 spectral bands.

    MCD12Q1 land cover is estimated annually. For each year, reflectance time-series data are smoothed and gap-filled via smoothing splines; derived spectro-temporal features are used as input to a random forest classifier; and output land cover classifications are post-processed, to incorporate prior knowledge and reduce inter-annual variability69. The source raster dataset for 2001 and 2014 to 2018, inclusive, at approximately 500 meter resolution, was summarized by centroid, fraction, and mode at ISEA3H resolutions 5 to 10. Variable codes and descriptions are listed in Table 8.

    Table 8. Codes and descriptions for the 16 IGBP land cover classes, from Friedl and Sulla-Menashe69.

    MOD44B. The MODIS vegetation continuous fields (VCF) dataset (MOD44B)70 describes global land cover quantitatively, as fractions of three cover components: tree canopy, non-tree canopy, and non-vegetated, barren cover. Note that canopy cover, as defined here, indicates the area over which light is intercepted; this differs from crown cover, which indicates the area covered by a plant’s crown regardless of light interception/penetration. The MOD44B dataset is derived from reflectance data collected by the MODIS instrument aboard the Terra satellite; for each annual VCF estimate, reflectance time-series data are used as input to a bagged ensemble of linear regression trees70. The source raster dataset for 2018, at approximately 250 meter resolution, was summarized by area-weighted mean at ISEA3H resolution 9.

    Physical geography

    Biogeographic Realms. As defined by Olson et al.56, the eight terrestrial biogeographic realms are the broadest divisions of the Earth’s terrestrial flora and fauna; these may be further subdivided into biomes and ecoregions, the latter containing distinct natural communities. Olson et al.56 developed this hierarchical system primarily for global and regional conservation planning. Realm, biome, and ecoregion delineations are based on expert knowledge, contributed by more than 1,000 scientists working in relevant fields; these divisions thus incorporate knowledge of endemic taxa, unique species assemblages, and local geological and biogeographical history56. Realms were included in the Eco-ISEA3H database37 to provide a high-level classification of the Earth’s biogeography, from a source frequently cited in the scientific literature. The source vector dataset was summarized by fraction and mode at ISEA3H resolutions 5 to 9. Variable codes and descriptions are listed in Table 9.

    Table 9. Codes and descriptions for the eight biogeographic realms, from Olson et al.56.

    ENVIREM. In addition to the climatic variables discussed previously, the ENVIREM dataset60 contains two topographic variables, derived from SRTM30_PLUS. These two indices characterize terrain roughness, a measure of variability in local elevation; and topographic wetness, a function of slope and upgradient contributing area. Source rasters at 30 arc-second resolution were summarized by area-weighted mean at ISEA3H resolutions 8 and 9. Variable codes, descriptions, and units are listed in Table 10.

    Table 10. Codes, descriptions, and units for the two ENVIREM topographic variables, from Title and Bemmels60.

    GLWD. The Global Lakes and Wetlands Database (GLWD)71, Level 3, represents the maximum extent of lakes, reservoirs, rivers, and a number of wetland types, comprising 12 waterbody classes in total. Lehner and Döll71 compiled the three levels of the GLWD by combining seven source map and attribute datasets, and suggest Level 3 may be useful as input in global hydrologic and climatic modeling. The source raster dataset at 30 arc-second resolution was summarized by fraction and mode at ISEA3H resolution 9. Variable codes and descriptions are listed in Table 11.

    Table 11. Codes and descriptions for the 12 GLWD waterbody classes, from Lehner and Döll71.

    Natural Earth. Natural Earth is a public-domain collection of raster and vector datasets developed for production cartography. Three vector themes describing physical geography were incorporated: Land, which includes continents and major islands; Islands, which includes additional minor islands; and Lakes, which includes lakes and reservoirs. Source vector datasets at 1:10,000,000 scale were summarized by fraction at ISEA3H resolutions 5 to 9. Further, fractions for a Terra theme were calculated, by adding per-cell Land and Islands, and subtracting Lakes. The Terra theme may be thresholded (for example, at a fraction value ≥0.5) to identify terrestrial ISEA3H cells, excluding cells covered primarily by ocean or freshwater habitat.
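The Terra calculation and thresholding described above amount to simple per-cell arithmetic, sketched below. The function names are hypothetical, the 0.5 threshold is the example value given in the text, and the sketch does not clamp the result to the range 0 to 1 because the source does not say whether that is done.

```python
def terra_fraction(land, islands, lakes):
    """Per-cell Terra fraction: Land plus Islands, minus Lakes."""
    return land + islands - lakes


def is_terrestrial(land, islands, lakes, threshold=0.5):
    """Flag a cell as terrestrial when its Terra fraction meets the threshold."""
    return terra_fraction(land, islands, lakes) >= threshold


# Made-up fractions for two cells: a lake-dominated coast and an inland cell.
coastal = is_terrestrial(land=0.7, islands=0.05, lakes=0.3)
inland = is_terrestrial(land=0.6, islands=0.0, lakes=0.05)
```

Here the first cell has a Terra fraction of about 0.45 and is excluded, while the second, at about 0.55, is retained as terrestrial.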

    SRTM30_PLUS. The SRTM30_PLUS dataset72 is a global digital elevation model (DEM), representing the Earth’s terrestrial topography and ocean bathymetry. A number of elevation sources were incorporated in developing the DEM; terrestrial topography was derived from the Shuttle Radar Topography Mission (SRTM) at latitudes between ±60°, from GTOPO30 in the Arctic, and from GLAS/ICESat in the Antarctic. Ocean bathymetry was derived from satellite radar altimetry, calibrated on 298 million corrected ship-based depth soundings, gathered from several sounding sources72. The source raster dataset at 30 arc-second resolution was summarized by area-weighted mean at ISEA3H resolutions 6 to 10.

    Species ranges

    From the Red List and the Phylogenetic Atlas, the geographic ranges of species belonging to four mammalian orders were sampled: Artiodactyla (even-toed ungulates), Perissodactyla (odd-toed ungulates), Primates, and Proboscidea (elephants). These species are primarily large-bodied herbivores, and as such are frequently the subject of dental ecometrics research; for example, averaged dental traits of communities of these mammals have been used to predict measures of local precipitation, at both global3 and regional11 scales.

    IUCN Red List. The International Union for Conservation of Nature’s (IUCN) Red List of Threatened Species50 comprises global assessments of the conservation status of nearly 150,000 floral, faunal, and fungal species. The Red List includes expert-delineated geographic ranges for most of these species, including most extant mammalian species. For each species, portions of the range for which the species’ presence was coded extant, and for which its origin was coded native or reintroduced, were sampled. Source vector datasets were summarized by fraction at ISEA3H resolutions 8 to 9 (Artiodactyla and Perissodactyla), 9 (Primates), and 7 to 9 (Proboscidea).

    PHYLACINE. The Phylogenetic Atlas of Mammal Macroecology (PHYLACINE) [73, 74] includes trait, phylogeny, and geographic range data for all mammalian species known from the last interglacial period (approximately 130,000 years ago) to the present, both extant and recently extinct. PHYLACINE includes species’ ranges under two scenarios, both of which were incorporated: present-day ranges, from the IUCN v2016.3; and “present-natural” ranges, for which each species’ present-day range was modified to estimate its distribution under current climatic conditions, but absent anthropogenic pressure. This included, among eight modification categories, reconnecting fragmented ranges, by filling suitable intervening habitat; and expanding ranges reduced by human activity, by filling suitable adjacent habitat. Present-natural range modifications are documented for each species in PHYLACINE’s metadata, and intended to mitigate human impact on the results of macroecological analysis and modeling. Source rasters at approximately 100 kilometer resolution were summarized by centroid at ISEA3H resolution 9.

  • in

    Globally invariant metabolism but density-diversity mismatch in springtails

    Data reporting

    The data underpinning this study are a compilation of existing datasets; therefore, no statistical methods were used to predetermine sample size, the experiments were not randomized, and the investigators were not blinded to allocation during experiments and outcome assessment. The measurements were taken from distinct samples; repeated measurements from the same sites were averaged in the main analysis.

    Inclusion & ethics

    Data were primarily collected from individual archives of contributing co-authors. The data collection initiative was openly announced via the mailing list of the 10th International Seminar on Apterygota and via social media (Twitter, Researchgate). In addition, colleagues from less explored regions (Africa, South America) were contacted via personal networks of the initial authors group and literature search. All direct data providers who collected and standardised the data were invited as co-authors with a defined minimum role (data provision and cleaning, manuscript editing and approval). For unpublished data, people who were directly involved in sorting and identification of springtails, including all local researchers, were invited as co-authors. Principal investigators were normally not included as co-authors, unless they contributed to conceptualisation and writing of the manuscript. All co-authors were informed and invited to contribute throughout the research process, from the study design and analysis to writing and editing. The study provided an inclusive platform for researchers around the globe to network, share and test their research ideas.

    Data acquisition

    Both published and unpublished data were collected, using raw data whenever possible, and entered into a common template. In addition, data available from Edaphobase [47] were included. 
The following minimum set of variables was collected: collectors, collection method (including sampling area and depth), extraction method, identification precision and resources, collection date, latitude and longitude, vegetation type (generalized as grassland, scrub, woodland, agriculture and other for the analysis), and abundances of springtail taxa found in each soil sample (or sampling site). Underrepresented geographical areas (Africa, South America, Australia and Southeast Asia) were specifically targeted by a literature search in the Web of Science database using the keywords ‘springtail’ or ‘Collembola’, ‘density’ or ‘abundance’ or ‘diversity’, and the region of interest; data were acquired from all papers found if the minimum information listed above was provided. All collected datasets were cleaned using OpenRefine v3.3 (https://openrefine.org) to remove inconsistencies and typos. Geographical coordinates were checked by comparing the dataset descriptions with the geographical coordinates. In total, 363 datasets comprising 2783 sites were collected and collated into a single dataset (Supplementary Fig. 1).

    Calculation of community parameters

    Community parameters were calculated at the site level. Here, we defined a site as a locality that hosts a defined springtail community, is covered by a certain vegetation type, with a certain management, and is usually represented by a sampling area of up to a hundred metres in diameter, making species co-occurrence and interactions plausible. To calculate density, numerical abundance in all samples was averaged and recalculated per square metre using the sampling area. Springtail communities were assessed predominantly during active vegetation periods (i.e., spring, summer and autumn in temperate and boreal biomes, and summer in polar biomes). Our estimations of community parameters therefore refer to the most favourable conditions (peak yearly densities). 
This seasonal sampling bias is likely to have little effect on our conclusions, since most springtails survive during cold periods [38, 48]. Finally, we used mean annual soil temperatures [49] to estimate the seasonal mean community metabolism (described below) and tested for the seasonal bias in additional analysis (see Linear mixed-effects models).

    All data analyses were conducted in R v. 4.0.2 [50] with RStudio interface v. 1.4.1103 (RStudio, PBC). Data were transformed and visualised with tidyverse packages [51, 52], unless otherwise mentioned. Background for the global maps was acquired via the maps package [53, 54]. To calculate local species richness, we used data identified to species or morphospecies level (validated by the expert team). Since the sampling effort varied among studies, we extrapolated species richness using rarefaction curves based on individual samples with the Chao estimator [51, 52] in the vegan package [53]. For some sites, sample-level data were not available in the original publications, but site-level averages were provided, and an extensive sampling effort was made. In such cases, we predicted extrapolated species richness based on the completeness (ratio of observed to extrapolated richness) recorded at sites where sample-level data were available (only sites with 5 or more samples were used for the prediction). We built a binomial model to predict completeness in sites where no sample-level data were available using latitude and the number of samples taken at a site as predictors: glm(Completeness~N_samples*Latitude). We found a positive effect of the number of samples (Chisq = 1.97, p = 0.0492) and latitude (Chisq = 2.07, p = 0.0391) on the completeness (Supplementary Figs. 17–19). 
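
The authors fitted the completeness model in R. As a rough illustration of how a fitted binomial model turns into an extrapolated richness, the following Python sketch applies a logit-link prediction and then divides observed richness by predicted completeness (completeness being the ratio of observed to extrapolated richness). The coefficient values are hypothetical placeholders, not the fitted values from the study, and the function names are invented for this example:

```python
import math


def predict_completeness(n_samples, latitude, coef):
    """Predicted completeness from a logit-link model with an interaction.

    coef = (intercept, b_samples, b_latitude, b_interaction) on the logit
    scale, mirroring the terms of Completeness ~ N_samples * Latitude.
    """
    b0, b_n, b_lat, b_int = coef
    eta = b0 + b_n * n_samples + b_lat * latitude + b_int * n_samples * latitude
    return 1.0 / (1.0 + math.exp(-eta))


def extrapolated_richness(observed_richness, completeness):
    """Invert completeness = observed / extrapolated."""
    if not 0.0 < completeness <= 1.0:
        raise ValueError("completeness must be in (0, 1]")
    return observed_richness / completeness


# Hypothetical coefficients, for illustration only.
HYPOTHETICAL_COEF = (0.2, 0.05, 0.01, 0.0)
completeness = predict_completeness(n_samples=10, latitude=50.0,
                                    coef=HYPOTHETICAL_COEF)
richness = extrapolated_richness(observed_richness=18, completeness=completeness)
```

Because completeness cannot exceed 1, the extrapolated richness is never lower than the observed richness.
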
We further used this model to predict extrapolated species richness on the sites with pooled data (435 sites in Europe, 15 in Australia, 6 in South America, 4 in Asia, and 3 in Africa).

    To calculate biomass, we first cross-checked all taxonomic names with the collembola.org checklist [55] using fuzzy matching algorithms (fuzzyjoin R package [56]) to align taxonomic names and correct typos. Then we merged taxonomic names with a dataset on body lengths compiled from the BETSI database [57], a personal database of Matty P. Berg, and additional expert contributions. We used average body lengths for the genus level (body size data on 432 genera) since data at the species level were not available for many morphospecies (especially in tropical regions), and species within most springtail genera had similar body size ranges. Data with no genus-level identifications were excluded from the analysis. Dry and fresh body masses were calculated from body length using a set of group-specific length-mass regressions (Supplementary Table 1) [58, 59], and the results of different regressions applied to the same morphogroup were averaged. Dry mass was recalculated to fresh mass using corresponding group-specific coefficients [58]. We used fresh mass to calculate individual metabolic rates [60] and account for the mean annual topsoil (0–5 cm) temperature at a given site [61]. Group-specific metabolic coefficients for insects (including springtails) were used for the calculation: normalization factor (i0) ln(21.972) [J h⁻¹], allometric exponent (a) 0.759, and activation energy (E) 0.657 [eV] [60]. Community-weighted (specimen-based) mean individual dry masses and metabolic rates were calculated for each sample and then averaged by site after excluding 10% of maximum and 10% of minimum values to reduce the impact of outliers. 
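
The coefficients quoted above fit the common mass- and temperature-dependent form ln I = ln i0 + a ln M − E/(kT). The Python sketch below is one reading of that calculation, under stated assumptions: the quoted normalization is taken to mean ln i0 = 21.972, body mass is assumed to be fresh mass in mg, T is soil temperature in kelvin, and k is the Boltzmann constant in eV K⁻¹. The trimmed mean drops the lowest and highest 10% of values, with the number dropped rounded down, which is also an assumption. Function names are invented for this example:

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV per kelvin


def metabolic_rate_j_per_h(fresh_mass_mg, temp_c,
                           ln_i0=21.972, a=0.759, e_ev=0.657):
    """Individual metabolic rate (J/h) from fresh mass and temperature.

    Assumes ln I = ln_i0 + a * ln(M) - E / (k * T); the mass unit (mg)
    and the reading of the normalization factor are assumptions.
    """
    temp_k = temp_c + 273.15
    ln_rate = (ln_i0 + a * math.log(fresh_mass_mg)
               - e_ev / (BOLTZMANN_EV_PER_K * temp_k))
    return math.exp(ln_rate)


def trimmed_mean(values, proportion=0.10):
    """Mean after dropping the lowest and highest `proportion` of values."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number dropped from each tail
    kept = ordered[k:len(ordered) - k] if k > 0 else ordered
    return sum(kept) / len(kept)


# Hypothetical individual: 1 mg fresh mass at 15 degrees C topsoil temperature.
rate = metabolic_rate_j_per_h(1.0, 15.0)
site_mean = trimmed_mean([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```

In this form, doubling body mass multiplies the rate by 2^0.759, and warmer soil raises it through the Arrhenius term.
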
To calculate site-level biomass and community metabolism, we summed masses or metabolic rates of individuals, averaged them across samples, and recalculated them per unit area (m²).

    Parameter uncertainties

    Our biomass and community metabolism approximations contain several assumptions. To account for the uncertainty in the length-mass and mass-metabolism regression coefficients, in addition to the average coefficients, we also used maximum (average + standard error) and minimum coefficients (average − standard error; Supplementary Table 1) in all equations to calculate maximum and minimum estimations of biomass and community metabolism reported in the main text. Further, we ignored latitudinal variation in body sizes within taxonomic groups [62]. Nevertheless, latitudinal differences in springtail density (30-fold), environmental temperature (from −16.0 to +27.6 °C in the air and from −10.2 to +30.4 °C in the soil), and genus-level community compositions (there are only a few common genera among polar regions and the tropics) [55] are higher than the uncertainties introduced by indirect parameter estimations, which allowed us to detect global trends. Although most springtails are concentrated in the litter and uppermost soil layers [20], their vertical distribution depends on the particular ecosystem [63]. Since sampling methods are usually ecosystem-specific (i.e. sampling is done deeper in soils with developed organic layers), we treated the methods used by the original data collectors as representative of a given ecosystem. Under this assumption, we might have underestimated the number of springtails in soils with deep organic horizons, so our global estimates are conservative and we would expect true global density and biomass to be slightly higher. To minimize these effects, we excluded sites where the estimations were likely to be unreliable (see data selection below).

    Data selection

    Only data collection methods allowing for area-based recalculation (e.g. 
Tullgren or Berlese funnels) were used for analysis. Data from artificial habitats, coastal ecosystems, caves, canopies, snow surfaces, and strong experimental manipulations beyond the bounds of naturally occurring conditions were excluded (Supplementary Fig. 1). To ensure data quality, we performed a two-step quality check: technical selection and expert evaluation. Collected data varied according to collection protocols, such as sampling depth and the microhabitats (layers) considered. To technically exclude unreliable density estimations, we explored data with a number of diagnostic graphs (Supplementary Table 2; Supplementary Figs. 12–20) and filtered it, excluding the following: (1) All woodlands where only soil or only litter was considered; (2) All scrub ecosystems where only ground cover (litter or mosses) was considered; (3) Agricultural sites in temperate zones where only soil with sampling depth 90% of cases were masked on the main maps; for the map with density-species richness visualisation, two corresponding masks were applied (Fig. 2).

    To estimate spatial variability of our predictions while accounting for the spatial sampling bias in our data (Fig. 1a) we performed a spatially stratified bootstrapping procedure. We used the relative area of each IPBES [79] region (i.e., Europe and Central Asia, Asia and the Pacific, Africa, and the Americas) to resample the original dataset, creating 100 bootstrap resamples. Each of these resamples was used to create a global map, which was then reduced to create mean, standard deviation, 95% confidence interval, and coefficient of variation maps (Supplementary Figs. 4–7).

    Global biomass, abundance, and community metabolism of springtails were estimated by summing predicted values for each 30 arcsec pixel [10]. 
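
The text does not spell out exactly how region area enters the stratified bootstrap. One plausible reading, sketched here in Python, draws sites with replacement within each region and sets the number of draws per region in proportion to the region's relative area. The region names, areas, site labels, and function name below are hypothetical, and the allocation rule is an assumption rather than the authors' documented procedure:

```python
import random


def stratified_bootstrap(sites_by_region, region_area, n_total, rng):
    """One bootstrap resample with per-region draws proportional to area.

    Sites are sampled with replacement within each region; rounding of
    the per-region counts is a simplification made for this sketch.
    """
    total_area = sum(region_area.values())
    resample = []
    for region, sites in sites_by_region.items():
        n_region = round(n_total * region_area[region] / total_area)
        resample.extend(rng.choices(sites, k=n_region))
    return resample


# Hypothetical regions: A is three times larger than B but has fewer sites.
sites_by_region = {"A": ["a1", "a2", "a3"],
                   "B": ["b1", "b2", "b3", "b4", "b5"]}
region_area = {"A": 3.0, "B": 1.0}
rng = random.Random(1)  # fixed seed so the sketch is repeatable
resamples = [stratified_bootstrap(sites_by_region, region_area, 8, rng)
             for _ in range(100)]
```

Each of the 100 resamples would then feed one model fit and one predicted map, as described in the text.
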
Global community metabolism was recalculated from joule to mass carbon by assuming 1 kg fresh mass = 7 × 10⁶ J [80], an average water proportion in springtails of 70% [58], and an average carbon concentration of 45% (calculated from 225 measurements across temperate forest ecosystems) [81]. We repeated the procedure of global extrapolation and prediction for biomass and community metabolism using minimum and maximum estimates of these parameters from regression coefficient uncertainties (see Parameter uncertainties).

    Path analysis

    To reveal the predictors of springtail communities at the global scale, we performed a path analysis. After filtering the selected environmental variables (see above) according to their global availability and collinearity, 13 variables were used (Supplementary Fig. 9b): mean annual air temperature, mean annual precipitation (CHELSA database [67]), aridity (CGIAR database [68]), soil pH, sand and clay contents combined (sand and clay contents were co-linear in our dataset), soil organic carbon content (SoilGrids database [73]), NDVI (MODIS database [72]), human population density (GPWv4 database [74]), latitude, elevation [69], and vegetation cover reported by the data providers following the habitat classification of the European Environment Agency (woodland, scrub, agriculture, and grasslands; the latter were coded as the combination of woodland, scrub, and agriculture absent). Before running the analysis, we performed Rosner’s generalized extreme Studentized deviate test in the EnvStats package [82] to exclude extreme outliers and we z-standardized all variables (Supplementary R Code).

    Separate structural equation models were run to predict density, dry biomass, community metabolism, and local species richness in the lavaan package [83]. 
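
The joule-to-carbon conversion given at the start of this passage chains three factors: energy to fresh mass, fresh mass to dry mass, and dry mass to carbon. A minimal Python sketch, assuming the 45% carbon concentration applies to dry mass (the text does not state this explicitly); the constant and function names are invented for illustration:

```python
JOULES_PER_KG_FRESH_MASS = 7e6      # 1 kg fresh mass = 7 x 10^6 J (from the text)
WATER_FRACTION = 0.70               # average water proportion in springtails
CARBON_FRACTION_OF_DRY_MASS = 0.45  # assumed to refer to dry mass


def energy_to_carbon_kg(energy_joules):
    """Convert metabolised energy (J) to an equivalent mass of carbon (kg)."""
    fresh_mass_kg = energy_joules / JOULES_PER_KG_FRESH_MASS
    dry_mass_kg = fresh_mass_kg * (1.0 - WATER_FRACTION)
    return dry_mass_kg * CARBON_FRACTION_OF_DRY_MASS


# 7 x 10^6 J corresponds to 1 kg fresh mass, i.e. 0.3 kg dry mass
# and about 0.135 kg carbon under these assumptions.
carbon_kg = energy_to_carbon_kg(7e6)
```
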
To account for the spatial clustering of our data in Europe, instead of running a model for the entire dataset, we divided the data by the IPBES [79] geographical regions and selected a random subset of sites for Eurasia, such that only twice the number of sites were included in the model as the second-most represented region. We ran the path analysis 99 times for each community parameter with different Eurasian subsets (density had n = 723 per iteration, local species richness had n = 352, dry biomass had n = 568, and community metabolism had n = 533). We decided to keep the share of the Eurasian dataset larger than other regions to increase the number of sites per iteration and validity of the models. The Eurasian dataset also had the best data quality among all regions and a substantial reduction in datasets from Eurasia would result in a low weight for high-quality data. We additionally ran a set of models in which the Eurasian dataset was represented by the same number of sites as the second-most represented region, which yielded similar effect directions for all factors, but slightly higher variations and fewer consistently significant effects. In the paper, only the first version of analysis is presented. To illustrate the results, we averaged effect sizes for the paths across all iterations and presented the distribution of these effect sizes using mirrored kernel density estimation (violin) plots. We marked and discussed effects that were significant at p
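
The regional subsampling described above can be sketched in Python as follows: in each iteration the dominant region is randomly thinned to twice the size of the second-most represented region, while all other regions are kept whole. The region names, site counts, and function name are hypothetical; the study itself ran the models in R with lavaan:

```python
import random


def subsample_dominant_region(sites_by_region, dominant, ratio=2, rng=None):
    """Keep all sites from other regions; cap the dominant region.

    The cap is `ratio` times the size of the largest non-dominant region.
    """
    rng = rng or random.Random()
    other_sizes = [len(sites) for region, sites in sites_by_region.items()
                   if region != dominant]
    cap = ratio * max(other_sizes)
    dominant_sites = sites_by_region[dominant]
    if len(dominant_sites) > cap:
        dominant_sites = rng.sample(dominant_sites, cap)
    kept = [site for region, sites in sites_by_region.items()
            if region != dominant for site in sites]
    return kept + list(dominant_sites)


# Hypothetical dataset: 100 Eurasian sites, 10 American, 5 African.
sites_by_region = {
    "Eurasia": [f"eu{i}" for i in range(100)],
    "Americas": [f"am{i}" for i in range(10)],
    "Africa": [f"af{i}" for i in range(5)],
}
rng = random.Random(42)
iterations = [subsample_dominant_region(sites_by_region, "Eurasia", rng=rng)
              for _ in range(99)]
```

Each of the 99 subsets would be passed to a separate model fit, and the effect sizes averaged across iterations.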

  • in

    Co-cultivation of Mortierellaceae with Pseudomonas helmanticensis affects both their growth and volatilome

    The growth behaviour of Linnemannia is strain-specific

    Most strains showed comparable morphological characteristics on both media as well as in pure and co-culture. However, Linnemannia solitaria and Entomortierella galaxiae produced more aerial mycelium on PDA compared to LcA. Depending on the strain, co-cultures with P. helmanticensis had either more or less aerial mycelium than pure cultures (Fig. 1, SI Fig. S3).

    The comparison of Linnemannia and E. galaxiae daily radial growth rates did not support a difference between these genera (p ≥ 0.3). The overall linear model indicated that the fungal daily growth rates mainly differed among species (Table 1). In addition, the effect of strains highlighted the heterogeneity among strains within species (Fig. 2, SI Figs. S4, S5). Although there was no relevant main effect of medium on the daily radial growth rate of the fungi, the medium did affect the fungi in a strain-specific manner (Table 1, Fig. 2, SI Figs. S4, S5). On nutrient-poor LcA, the fungal daily radial growth rates were reduced for all species, except for L. solitaria, which grew better on LcA (SI Figs. S3, S4).

    Table 1 The effect of experimental factors on the fungal daily radial growth rate.

    Figure 2 Daily radial growth rate of pure Linnemannia and Entomortierella cultures as well as co-cultures with P. helmanticensis on nutrient-rich PDA medium. (a) L. exigua, (b) L. gamsii, (c) L. hyalina, (d) L. sclerotiella, (e) L. solitaria, (f) E. galaxiae.

    The main effect of co-plating P. helmanticensis on radial growth rate was small, yet significant (0.7%, p

  • in

    This baby turtle surprised scientists by swimming against the current

    In 2008, I had just begun volunteering at Equilibrio Azul, a non-profit marine-research and -conservation organization based in Quito, when colleagues discovered a hawksbill sea turtle (Eretmochelys imbricata) nesting at La Playita beach in Ecuador. The eastern Pacific population of hawksbill sea turtles is one of the most endangered in the world and was considered functionally extinct in the region before this turtle and others were observed.

    That discovery was a tipping point for hawksbill research in Ecuador and throughout the Pacific Ocean. Since 2008, we’ve found about 20 nests each year at La Playita, and one season, we documented 50.

    We have tagged 11 adult females with satellite transmitters. Previously, most of our understanding of these turtles had been based on observations in the Caribbean, where the reptiles are strictly coral-reef dwellers. But Ecuador’s reefs are mostly rocky, with patches of coral, and we were surprised to see females migrate south to mangroves, mainly for food.
    In this image, we have just attached a transmitter to a baby turtle, a first for hawksbill turtles this young and in the eastern Pacific region. We did not know much about hawksbills at this young age. It is tricky working with baby turtles, because they grow very fast, and the transmitters, which give us location data, can easily fall off. We’ve used cement to glue the devices to the shells of six newborns so far. The longest the transmitters have lasted is three months and the shortest period was only six weeks, but the devices provided our first insights into the ‘lost years’ of sea-turtle biology.

    Our findings have overturned assumptions that neonates were just carried along by currents. Instead, we found that one-day-old turtles can swim against the current. They aim for a specific direction, north by northwest, as they learn to dive and swim. We tracked one-year-old hawksbills to Costa Rican waters, a journey of roughly 2,000 kilometres, before we lost their signal.

    Cristina Miranda is a scientific coordinator at Equilibrio Azul in Quito, Ecuador. Interview by Virginia Gewin.

  • in

    Rapid diversification underlying the global dominance of a cosmopolitan phytoplankton

    Genetic and morphological delineation between G. huxleyi strains

    We first assessed genetic variability through analysis of genomic polymorphism to determine whether distinct genetic lineages exist in G. huxleyi and to test whether these relate to morphotypes. We used 2,086,643 high-quality biallelic single nucleotide polymorphisms (SNPs) retrieved from the 47 clonal culture strains with the best genome sequence coverage (>20×). A principal component analysis (PCA) and a discriminant analysis of principal components (DAPC) both delineate three well-defined genetic groups, with the distribution of strains being unequal and with no overlap on the principal components (Fig. 1a; Supplementary Fig. S3a,b). With regard to population structure, the DAPC analysis suggested that 3 clusters (K = 3) can be used to depict a genotype membership matrix for each strain (Fig. 1b; Supplementary Fig. S4). As such, it confirmed the three-lineage delineation proposed by the PCA, while illustrating no admixture between lineages.

    Fig. 1: Relationship between genetic structure and morphotypes in G. huxleyi. a Principal component analysis (PCA) based on 2,086,643 SNPs recovered from 47 G. huxleyi genomes; b Relationship between coalescent species phylogeny (ASTRAL tree based on 1000 supergenes) and DAPC clustering; c Correspondence between morphotypes and lineages within G. huxleyi, and sub-lineages within A1 (scale bar = 4 μm). 
Variable elements in relation to genotypes are highlighted in the schematics under the SEM pictures; d Distribution of coccolith length for 5 randomly chosen strains representing each clade and sub-clade, with a jittered box-plot on the left and a half-violin plot on the right for each group; e Matrix plot of Bonferroni corrected p-value corresponding to the Dunn-test for the comparison of coccolith length measurements between groups.

    Phylogenetic inference based on alignments with higher mapping coverage only (47 strains) or including sequences with lower mapping coverage (59 strains) all supported segregation of strains into three main lineages, which we term clades A1, A2 and B, with A1 and A2 being more closely related to each other than to B (Fig. 1b; Supplementary Fig. S5a, b). This delineation is congruent with previous studies on the phylogeny of the Gephyrocapsa genus [17, 46, 65]. These clades also correspond to differences in morphotypes (Fig. 1b, c). All strains in clade A1 produce unambiguous A-group coccolith morphotypes (type A and type R). Similarly, all strains in clade B produce unambiguous B-group coccolith morphotypes (type B and type O). Clade A2 is less distinctive, with strains producing lightly calcified type A coccoliths. Some of these strains could be classified as type B/C [66] or C (both regarded as B-group morphotypes), but are distinguished by the lower elevation of distal shield elements and by a greater degree of calcification of the central area grid (which is reduced and sometimes absent in morphotypes B/C and C). At a finer level, clade A1 is composed of four sub-clades, which we term A1a, A1b, A1c, and A1d. 
Strains in sub-clades A1a, A1c and A1d all produce coccoliths with type A morphologies and distinctive degrees of calcification: strains in the sub-clade A1a form relatively lightly calcified coccoliths with regular elements, while strains in sub-clades A1c and A1d produce similar moderately calcified coccoliths, sometimes with conspicuous irregularities (inner tube elements overlapping into the central area). Strains in sub-clade A1b produce distinct coccoliths exhibiting A-group morphology but with heavy calcification, including forms with heavily calcified shields which have been termed type R and also forms with heavily calcified central areas which have been referred to as “type A overcalcified”. Some clade A2 strains produce coccoliths with a similar morphology to strains in A1a, indicative of partially cryptic lineages (Supplementary Fig. S2; Supplementary Table S4).

    The congruence between morphotypes and clades is also supported by significant differences in the length of coccoliths measured between some of the clades (Fig. 1d, e). Morphogroups A and B differ significantly, and the non-significant comparisons are those between A1 sub-clades and clade A2, which reinforces the close relationship between A1 and A2. We also note that the absence of a significant difference in coccolith length between A1a and A2 concurs with the cryptic delineation mentioned above.

    Based on the clustering analyses and the phylogenetic reconstructions, we tested whether different groupings are distinct species with regard to the null hypothesis “G. huxleyi is a single species”, which corresponds to the current state of taxonomy. 
Species delimitation based on comparison of Marginal Likelihood Estimators (MLE) with Bayes Factors (BF) supported the hypothesis that the three lineages depicted by ordination and phylogenetic reconstructions are distinct species as the best model (Table 1).

    Table 1 Species delimitation based on Bayes Factor Delimitation (BFD).

    D-statistics calculated to estimate gene flow reveal a non-significant excess of alleles shared between the three lineages (Fig. 2a; Supplementary Table S5). F-branch statistics (fb) revealed significant signatures of gene-flow between sub-lineages within A1 associated with correlated estimates in relation to A1a, A2 and B (Fig. 2a) [60]. Signatures on the basal branch of diversification in A1 may correspond to genetic exchanges between A1 and B, with gene-flow signatures attributed to A2 corresponding to correlated estimates due to common ancestry. Recent signatures of gene-flow throughout the evolution of A1 are thus likely associated with the common ancestry between A1a, A2 and B during gene-flow events between the sub-lineages, as supported by the non-significant D statistics between the three lineages. Moreover, the phylogenetic network revealed similar convolutions between A1 sub-lineages but clear separation of the main lineages and longer branches in the A2 lineage (Fig. 2b).

    Fig. 2: Excess of allele sharing and differentiation in G. huxleyi. a f-branch (fb) statistics between lineages and sub-lineages. The gradient represents the fb score, grey blocks represent tests not consistent with the species tree (for each branch on the topology of the y axis, having itself or a sister taxon as donor on the topology of the x axis); asterisks denote block jack-knifing significance at p
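
For readers unfamiliar with the D-statistics mentioned above, the textbook form (Patterson's D, or the ABBA-BABA test) compares counts of two discordant site patterns across four taxa: D = (ABBA − BABA) / (ABBA + BABA). The Python sketch below shows that count-based definition on made-up biallelic sites; it does not reproduce the software, allele-frequency weighting, or block jack-knife used in the study, and the function name is invented:

```python
def patterson_d(sites):
    """Patterson's D from biallelic site patterns.

    Each site is a tuple (p1, p2, p3, outgroup) of alleles coded 0/1.
    ABBA: P2 and P3 share the derived allele; BABA: P1 and P3 share it.
    """
    abba = 0
    baba = 0
    for p1, p2, p3, out in sites:
        if p1 == out and p2 == p3 and p2 != out:
            abba += 1
        elif p2 == out and p1 == p3 and p1 != out:
            baba += 1
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba)


# Hypothetical sites: three ABBA, one BABA, one uninformative.
example_sites = [(0, 1, 1, 0)] * 3 + [(1, 0, 1, 0)] + [(0, 0, 1, 0)]
d_value = patterson_d(example_sites)
```

A value near zero indicates balanced pattern counts, as expected without gene flow; a consistent excess of one pattern pushes D away from zero.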

  • in

    Enhancing the ecological value of oil palm agriculture through set-asides

    Phalan, B. et al. Crop expansion and conservation priorities in tropical countries. PLoS ONE 8, e51759 (2013).Article 
    CAS 

    Google Scholar 
    Poore, J. & Nemecek, T. Reducing food’s environmental impacts through producers and consumers. Science 360, 987–992 (2018).Article 
    CAS 

    Google Scholar 
    Springmann, M. et al. Options for keeping the food system within environmental limits. Nature 562, 519–525 (2018).Tilman, D., Balzer, C., Hill, J. & Befort, B. L. Global food demand and the sustainable intensification of agriculture. Proc. Natl Acad. Sci. USA 108, 20260–20264 (2011).Article 
    CAS 

    Google Scholar 
    Searchinger, T. D., Wirsenius, S., Beringer, T. & Dumas, P. Assessing the efficiency of changes in land use for mitigating climate change. Nature 564, 249–253 (2018).Edwards, D. P. et al. Conservation of tropical forests in the Anthropocene. Curr. Biol. 29, R1008–R1020 (2019).Article 
    CAS 

    Google Scholar 
    Newbold, T. et al. Global patterns of terrestrial assemblage turnover within and among land uses. Ecography 39, 1151–1163 (2016).Article 

    Google Scholar 
    Newbold, T. et al. Global effects of land use on local terrestrial biodiversity. Nature 520, 45–50 (2015).Article 
    CAS 

    Google Scholar 
    Gibbs, H. K. et al. Tropical forests were the primary sources of new agricultural land in the 1980s and 1990s. Proc. Natl Acad. Sci. USA 107, 16732–16737 (2010).Hansen, M. C. et al. High-resolution global maps of 21st-century forest cover change. Science 342, 850–853 (2013).Article 
    CAS 

    Google Scholar 
    Newbold, T. et al. A global model of the response of tropical and sub-tropical forest biodiversity to anthropogenic pressures. Proc. R. Soc. B 281, 20141371 (2014).Article 

    Google Scholar 
    Clough, Y. et al. Combining high biodiversity with high yields in tropical agroforests. Proc. Natl Acad. Sci. USA 108, 8311–8316 (2011).Article 
    CAS 

    Google Scholar 
    Giam, X. Global biodiversity loss from tropical deforestation. Proc. Natl Acad. Sci. USA 114, 5775–5777 (2017).Article 
    CAS 

    Google Scholar 
    van der Werf, G. R. et al. CO2 emissions from forest loss. Nat. Geosci. 2, 737–738 (2009).Article 

    Google Scholar 
    Harvey, C. A. et al. Climate‐smart landscapes: opportunities and challenges for integrating adaptation and mitigation in tropical agriculture. Conserv. Lett. 7, 77–90 (2014).Article 

    Google Scholar 
    Harris, N. L. et al. Baseline map of carbon emissions from deforestation in tropical regions. Science 336, 1573–1576 (2012).Article 
    CAS 

    Google Scholar 
    Song, X.-P. et al. Global land change from 1982 to 2016. Nature 560, 639–643 (2018).Quezada, J. C., Etter, A., Ghazoul, J., Buttler, A. & Guillaume, T. Carbon neutral expansion of oil palm plantations in the Neotropics. Sci. Adv. 5, eaaw4418 (2019).Article 
    CAS 

    Google Scholar 
    Oil Palm and Biodiversity: a Situation Analysis by the IUCN Oil Palm Task Force (International Union for Conservation of Nature, 2018). https://doi.org/10.2305/IUCN.CH.2018.11.enMeijaard, E. & Sheil, D. The moral minefield of ethical oil palm and sustainable development. Front. For. Glob. Change 2, 22 (2019).The Future of Food and Agriculture – Alternative Pathways to 2050 (FAO, 2018).Henders, S., Persson, U. M. & Kastner, T. Trading forests: land-use change and carbon emissions embodied in production and exports of forest-risk commodities. Environ. Res. Lett. 10, 125012 (2015).Article 

    Google Scholar 
    Donofrio, S., Rothrock, P. & Leonard, J. Supply Change: Tracking Corporate Commitments to Deforestation-Free Supply Chains (Forest Trends, 2017).Terrenoire, E., Hauglustaine, D. A., Gasser, T. & Penanhoat, O. The contribution of carbon dioxide emissions from the aviation sector to future climate change. Environ. Res. Lett. 14, 084019 (2019).Article 
    CAS 

    Google Scholar 
    Parsons, S., Raikova, S. & Chuck, C. J. The viability and desirability of replacing palm oil. Nat. Sustain. 3, 412–418 (2020).Article 

    Google Scholar 
    Taheripour, F., Hertel, T. W. & Ramankutty, N. Market-mediated responses confound policies to limit deforestation from oil palm expansion in Malaysia and Indonesia. Proc. Natl Acad. Sci. USA 116, 19193–19199 (2019).Article 
    CAS 

    Google Scholar 
    Laurance, W. F. et al. Improving the performance of the roundtable on sustainable palm oil for nature conservation. Conserv. Biol. 24, 377–381 (2010).Article 

    Google Scholar 
    Meijaard, E., Abrams, J. F., Juffe-Bignoli, D., Voigt, M. & Sheil, D. Coconut oil, conservation and the conscientious consumer. Curr. Biol. 30, R757–R758 (2020).Article 
    CAS 

    Google Scholar 
    Driving Change With Sustainable Palm Oil (Roundtable on Sustainable Palm Oil, accessed August 2022). https://rspo.org/about
    Garrett, R. D., Carlson, K. M., Rueda, X. & Noojipady, P. Assessing the potential additionality of certification by the round table on responsible soybeans and the roundtable on sustainable palm oil. Environ. Res. Lett. 11, 045003 (2016).
    Mittermeier, R. A., Myers, N., Mittermeier, C. G. & Robles, G. Hotspots: Earth’s Biologically Richest and Most Endangered Terrestrial Ecoregions (Conservation International, 1999).
    Gaveau, D. L. et al. Rapid conversions and avoided deforestation: examining four decades of industrial plantation expansion in Borneo. Sci. Rep. 6, 32017 (2016).
    Luke, S. H. et al. Riparian buffers in tropical agriculture: scientific support, effectiveness and directions for policy. J. Appl. Ecol. 56, 85–92 (2019).
    Mitchell, S. L. et al. Riparian reserves help protect forest bird communities in oil palm dominated landscapes. J. Appl. Ecol. 55, 2744–2755 (2018).
    Scriven, S. A. et al. Testing the benefits of conservation set-asides for improved habitat connectivity in tropical agricultural landscapes. J. Appl. Ecol. 56, 2274–2285 (2019).
    Deere, N. J. et al. Riparian buffers can help mitigate biodiversity declines in oil palm agriculture. Front. Ecol. Environ. 20, 459–466 (2021).
    Woodham, C. R. et al. Effects of replanting and retention of mature oil palm riparian buffers on ecosystem functioning in oil palm plantations. Front. Glob. Change 2, 29 (2019).
    Carlson, K. M. et al. Influence of watershed‐climate interactions on stream temperature, sediment yield, and metabolism along a land use intensity gradient in Indonesian Borneo. J. Geophys. Res. Biogeosci. 119, 1110–1128 (2014).
    Carlson, K. M. et al. Effect of oil palm sustainability certification on deforestation and fire in Indonesia. Proc. Natl Acad. Sci. USA 115, 121–126 (2018).
    Fleiss, S. et al. Conservation set-asides improve carbon storage and support associated plant diversity in certified sustainable oil palm plantations. Biol. Conserv. 248, 108631 (2020).
    Wunder, S., Angelsen, A. & Belcher, B. Forests, livelihoods, and conservation: broadening the empirical base. World Dev. 64, S1–S11 (2014).
    Struebig, M. J. et al. Quantifying the biodiversity value of repeatedly logged rainforests: gradient and comparative approaches from Borneo. Adv. Ecol. Res. 48, 183–224 (2013).
    Shevade, V. S. & Loboda, T. V. Oil palm plantations in Peninsular Malaysia: determinants and constraints on expansion. PLoS ONE 14, e0210628 (2019).
    Pirker, J., Mosnier, A., Kraxner, F., Havlík, P. & Obersteiner, M. What are the limits to oil palm expansion? Glob. Environ. Change 40, 73–81 (2016).
    Launching the RSPO Jurisdictional Approach (JA) Piloting Framework (Roundtable on Sustainable Palm Oil, accessed August 2022).
    Abram, N. K. et al. Synergies for improving oil palm production and forest conservation in floodplain landscapes. PLoS ONE 9, e95388 (2014).
    Othman, N. et al. Shift of paradigm needed towards improving human–elephant coexistence in monoculture landscapes in Sabah. Int. Zoo Yearb. 53, 161–173 (2019).
    Horton, A. J. et al. Can riparian forest buffers increase yields from oil palm plantations? Earths Future 6, 1082–1096 (2018).
    Ewers, R. M. et al. A large-scale forest fragmentation experiment: the Stability of Altered Forest Ecosystems Project. Phil. Trans. R. Soc. Lond. B 366, 3292–3302 (2011).
    Pfeifer, M. et al. Creation of forest edges has a global impact on forest vertebrates. Nature 551, 187–191 (2017).
    Ewers, R. M., Thorpe, S. & Didham, R. K. Synergistic interactions between edge and area effects in a heavily fragmented landscape. Ecology 88, 96–106 (2007).
    Deere, N. J. et al. High carbon stock forests provide co-benefits for tropical biodiversity. J. Appl. Ecol. 55, 997–1008 (2018).
    Hemprich-Bennett, D. R. et al. Altered structure of bat–prey interaction networks in logged tropical forests revealed by metabarcoding. Mol. Ecol. 30, 5844–5857 (2021).
    Williamson, J. et al. Riparian buffers act as microclimatic refugia in oil palm landscapes. J. Appl. Ecol. 58, 431–442 (2021).
    Slade, E. M., Mann, D. J. & Lewis, O. T. Biodiversity and ecosystem function of tropical forest dung beetles under contrasting logging regimes. Biol. Conserv. 144, 166–174 (2011).
    Gray, R. E. J. et al. Movement of forest-dependent dung beetles through riparian buffers in Bornean oil palm plantations. J. Appl. Ecol. 59, 238–250 (2022).
    Woodman, S. M. et al. esdm: a tool for creating and exploring ensembles of predictions from species distribution and abundance models. Methods Ecol. Evol. 10, 1923–1933 (2019).
    Liu, C., Berry, P. M., Dawson, T. P. & Pearson, R. G. Selecting thresholds of occurrence in the prediction of species distributions. Ecography 28, 385–393 (2005).
    Piccini, I. et al. Greenhouse gas emissions from dung pats vary with dung beetle species and with assemblage composition. PLoS ONE 12, e0178077 (2017).
    Raine, E. H. & Slade, E. M. Dung beetle–mammal associations: methods, research trends and future directions. Proc. R. Soc. B 286, 20182002 (2019).
    Nichols, E., Gardner, T., Peres, C., Spector, S. & Network, S. R. Co‐declining mammals and dung beetles: an impending ecological cascade. Oikos 118, 481–487 (2009).
    Asner, G. P. et al. Mapped aboveground carbon stocks to advance forest conservation and recovery in Malaysian Borneo. Biol. Conserv. 217, 289–310 (2018).
    Jucker, T. et al. Estimating aboveground carbon density and its uncertainty in Borneo’s structurally complex tropical forests using airborne laser scanning. Biogeosciences 15, 3811–3830 (2018).
    Philipson, C. D. et al. Active restoration accelerates the carbon recovery of human-modified tropical forests. Science 369, 838–841 (2020).
    Nunes, M. H. et al. Recovery of logged forest fragments in a human-modified tropical landscape during the 2015–16 El Niño. Nat. Commun. 12, 1526 (2021).
    Woittiez, L. S., van Wijk, M. T., Slingerland, M., van Noordwijk, M. & Giller, K. E. Yield gaps in oil palm: a quantitative review of contributing factors. Eur. J. Agron. 83, 57–77 (2017).