in

Post-extinction recovery of the Phanerozoic oceans and biodiversity hotspots

Palaeogeographical model

We use palaeogeographical reconstructions describing Earth’s palaeotopography and palaeobathymetry for a series of time slices from 541 Ma to the present day. The reconstructions merge existing models from two published global reconstruction datasets—those of ref. 32 and ref. 33 (https://doi.org/10.5281/zenodo.5348492), which themselves are syntheses of a wealth of previous work.

For continental regions, estimates of palaeoelevation and continental flooding rely on a diverse range of geological evidence, such as sedimentary depositional environments and the spatiotemporal distribution of volcanic activity. For a full description, see a recent review34. Together, these data can be used to define the past locations of mountain ranges and palaeoshorelines34. For this part of our reconstruction, we used the compilation of ref. 33 with updated palaeoshorelines based on depositional environment information in current fossil databases35. This compilation comprises 82 palaeotopography maps covering the entire Phanerozoic. Note that each palaeogeographical map is a time slice representing the concatenation of geological data over several million years36.

We quantified the impact of using the original compilation of ref. 33 on our model results and found only small changes with respect to using the reconstructions with updated palaeoshorelines (Extended Data Fig. 3a–c). Similarly, eustatic sea level is thought to have varied by around 100 m at timescales much shorter than the duration of the time-slices throughout the Phanerozoic37, such that the extent of continental flooding could have varied within each time slice by an amount significant for our analysis. For this reason, and to assess the uncertainty of our results to continental palaeogeography in general, we computed additional maps of continental flooding in which the sea level is raised or lowered by 100 m compared with the original palaeo–digital elevation model grids of ref. 33 (Extended Data Fig. 3d–f).

For deep-ocean regions, the primary control on seafloor depth is the age of the seafloor, so reconstructing palaeobathymetry relies on constructing maps of seafloor age back in time38. As a consequence, we rely on reconstruction models that incorporate a continuous network of plate boundaries. For this part, we used the reconstruction of ref. 32 and derived maps of seafloor age from the plate tectonic model using the method of ref. 39 for which source code is available at GitHub (https://github.com/siwill22/agegrid-0.1). Palaeobathymetry is derived from the seafloor age maps following the steps outlined in ref. 38. It is important to note that seafloor age maps for most of the Phanerozoic (that is, pre-Pangaea times) are not directly constrained by data due to recycling of oceanic crust at subduction zones. Rather, they are model predictions generated by constructing plate motions and plate boundary configurations from the geological and palaeomagnetic record of the continents. Nonetheless, the first-order trends in ocean-basin volume and mean seafloor age are consistent with independent estimates for at least the last 410 million years (Myr)39.

The reconstructions of refs. 32,33 differ in the precise locations of the continents through time. To resolve this discrepancy, we reverse reconstructed the continental palaeoelevation model of ref. 33 to present-day coordinates using their rotation parameters, then reconstructed them back in time using the rotations of ref. 32. Owing to the differences in how the continents are divided into different tectonic units, this process leads to some gaps and overlaps in the results40, which we resolved primarily through a combination of data interpolation and averaging. Manual adjustments were made to ensure that the flooding history remained consistent with the original palaeotopography in areas in which interpolation gives a noticeably different history of seafloor ages. The resulting palaeotopography maps are therefore defined in palaeomagnetic reference frame32 appropriate for use in Earth system models.

For the biodiversity modelling, we generate estimates of the age of the seafloor for discrete points within the oceans and flooded continents, and track these ages through the lifetime of each point (Supplementary Video 5). For the oceans, this is achieved using the method described in ref. 39 in which the seafloor is represented by points that are incrementally generated at the mid-ocean ridges for a series of time steps 1 Myr apart, with each point tracked through subsequent time steps based on Euler poles of rotation until either the present-day is reached, or they arrive at a subduction zone and are considered to be destroyed.

For the continents, tracking the location of discrete points is generally simpler as most crust is conserved throughout the timespan of the reconstruction. In contrast to the deep oceans (where we assume that crust is at all times submerged), we model the ‘age’ of the seafloor from the history of continental flooding and emergence within the palaeogeographical interpretation33. The continents are seeded with uniformly distributed points at the oldest timeslice (541 Ma) at which they are assigned an age of zero. These points are tracked to subsequent time slices of which the palaeogeography is used to determine whether the point lies within a flooded or emergent region. Points within flooded regions of continents are considered to be seafloor, and the age of this seafloor is accumulated across consecutive time slices where a given point lies within a flooded region. When a point is within an emergent region, the seafloor age is reset to zero. Following this approach, individual points within stable continents may undergo several cycles of seafloor age increasing from zero before being reset. At the continental margins formed during the Pangaea breakup, the age of the seafloor continuously grows from the onset of rifting. Intraoceanic island arcs represent an additional case, which can appear as new tectonic units with the reconstructions at various times. In these cases, we assume that the seafloor has a zero-age at the time at which the intraoceanic arc first develops, then remains predominantly underwater for the rest of its lifetime.

Thus, for each of the 82 palaeogeographical reconstructions, we annotate 0.5° by 0.5° grids as continental, flooded continental shelf or oceanic for later use in model coupling and production of regional diversity maps.

Palaeoenvironmental conditions under the cGENIE Earth system model

We use cGENIE41, an Earth system model of intermediate complexity, to simulate palaeoenvironmental conditions of seawater temperature and organic carbon export production (as a surrogate for food supply) throughout the Phanerozoic (from 541 Ma to the present day).

cGENIE is based on a three-dimensional (3D) ocean circulation model coupled to a 2D energy–moisture balance atmospheric component and a sea-ice module. We configured the model on a 36 × 36 (latitude, longitude) equal area grid with 17 unevenly spaced vertical levels in depth, down to a maximum ocean depth of 5,900 m. The cycling of carbon and associated tracers in the ocean is based on a size-structured plankton ecosystem model with a single (phosphate) nutrient42,43, and adopts an Arrhenius-type temperature-dependent scheme for the remineralization of organic matter exported to the ocean interior44.

cGENIE provides a spatially resolved representation of ocean physics and biogeochemistry, which is a prerequisite for the present study to be able to reconstruct the spatial patterns of biodiversity in deep time. However, owing to the computational impracticality of generating a single transient simulation of physics (that is, temperature) and biogeochemistry (that is, export production) over the entire Phanerozoic, we therefore generate 30 model equilibria at regular time intervals throughout the Phanerozoic that are subsequently used as inputs for the regional diversification model (see the ‘Model coupling’ section).

We used 30 Phanerozoic palaeogeographical reconstructions through time (~20 Myr evenly spaced time intervals) to represent key time periods. For each continental configuration corresponding to a given age in Earth history, we generate idealized 2D (but zonally averaged) wind speed and wind stress, and 1D zonally averaged albedo forcing fields45 required by the cGENIE model using the ‘muffingen’ open-source software (see the ‘Code availability’ section). For each palaeogeographical reconstruction, the climatic forcing (that is, solar irradiance and carbon dioxide concentration) is adapted to match the corresponding geological time interval. The partial pressure of CO2 is taken from the recent update of the GEOCARB model46. Solar luminosity is calculated using the model of stellar physics of ref. 47. We impose modern-day orbital parameters (obliquity, eccentricity and precession). The simulations are initialized with a sea-ice-free ocean, homogeneous oceanic temperature (5 °C) and salinity (34.9‰). As variations in the oceanic concentration of bio-available phosphate remain challenging to reconstruct in the geological past48,49, we impose a present-day mean ocean phosphate concentration (2.159 μmol kg−1) in our baseline simulations. We quantify the impact of this uncertainty on our model results by conducting additional simulations using half and twice the present-day ocean phosphate concentration (Extended Data Fig. 3g–i). For each ocean phosphate scenario (that is, 0.5×, 1× and 2× the present-day value), each of the 30 model simulations is then run for 20,000 years, a duration ensuring that deep-ocean temperature and geochemistry reach equilibrium. For each model simulation, the results of the mean annual values of the last simulated year are used for the analysis. Note that, although cGENIE makes projections of the distribution of dissolved oxygen ([O2]) in the ocean, our diversification model does not currently consider oxygenation to be a limit on diversity. Thus, we assumed a modern atmospheric partial pressure of O2 in all 30 palaeo simulations and did not use the resulting projected [O2] fields.

Regional diversification model

We tested two models of diversification—the logistic model and the exponential model—describing the dynamics of regional diversity over time. In both models, the net diversification rate (ρ), with units of inverse time (Myr−1), varies within a pre-fixed range of values as a function of seawater temperature and food availability. The net diversification rate is then calculated for a given location and time according to the following equation:

$$rho ={rho }_{max }-({rho }_{max }-{rho }_{min })(1-{Q}_{{rm{temp}}}{Q}_{{rm{food}}})$$

(1)

where ρmin and ρmax set the lower and upper net diversification rate limits within which ρ is allowed to vary, and Qtemp and Qfood are non-dimensional limitation terms with values between 0 and 1 that define the dependence of ρ on temperature and food, respectively (Extended Data Table 1).

The model considers a direct relationship between seawater temperature, food supply and the rate of net diversification on the basis of the theoretical control that temperature and food supply exert on the rates of origination and extinction (Supplementary Fig. 1). Temperature rise is expected to accelerate the biochemical kinetics of metabolism50 and shorten the development times of individuals51, leading to higher rates of mutation and origination. Greater food availability increases population sizes, which increases the rates of mutation and reduces the probability of extinction52. Furthermore, a large body of observations shows the existence of a positive relationship between resource availability (that is, food supply) and the standing stock of species in marine and terrestrial communities53,54. A larger food supply would support a greater number of individuals. A greater diversity of food resources could also lead to a finer partitioning of available resources55.

The temperature dependence of ρ is calculated using the following equation:

$${Q}_{{rm{temp}}}=frac{{Q}_{10}^{frac{T-Tmin }{10}}}{{Q}_{10}^{frac{Tmax -Tmin }{10}}}$$

(2)

where the Q10 coefficient measures the temperature sensitivity of the origination rate. In equation (2) above, T is the seawater temperature (in °C) at a given location and time, and Tmin and Tmax are the 0.01 percentile and the 0.99 percentile, respectively, of the temperature frequency distribution in each time interval. In the model, the values of Tmin and Tmax used to calculate Qtemp are therefore recomputed for every time interval (~5 Myr) according to the temperature frequency distribution of the corresponding time interval. This enables us to use updated Tmin and Tmax values in each Phanerozoic time interval and to account for the thermal adaptation of organisms to ever changing climate conditions.

The food limitation term is parameterized using a Michaelis–Menten formulation as follows:

$${Q}_{{rm{food}}}=frac{text{POC flux}}{left({K}_{{rm{food}}},+,text{POC flux}right)}$$

(3)

where POC flux (mol m−2 yr−1) is the particulate organic carbon export flux, which is used as a surrogate for food availability, at a given location and time of the simulated seafloor. The parameter Kfood (mol m−2 yr−1) in equation (3) is the half-saturation constant, that is, the POC flux at which the diversification rate is half its maximum value, provided that other factors were not limiting. These temperature and food supply limitation terms vary in space and time as a result of changes in seawater temperature and particulate organic carbon export rate, respectively, thereby controlling the spatial and temporal variability of ρ (Supplementary Video 6).

The net diversification rate becomes negative (1) in the event of mass extinctions or (2) in response to regional-scale processes, such as sea-level fall and/or seafloor deformation along convergent plate boundaries. Mass extinction events are imposed as external perturbations to the diversification model by imputing negative net diversification rates to all active seafloor points (ocean points and flooded continental points) and assuming non-selective extinction. The percentage of diversity loss as well as the starting time and duration of mass extinctions are extracted from three fossil diversity curves of reference20,21,22 (Source Data for Fig. 1). Each of these fossil diversity curves provides different insights into the Phanerozoic history of marine animal diversity based on uncorrected range-through genus richness estimates20,22 and sampling standardized estimates21. Regional-scale processes—such as sea level fall during marine regressions and/or seafloor destruction at plate boundaries, either by subduction or uplift—are simulated by the combined plate tectonic–palaeoelevation model, and constrain the time that seafloor habitats have to accumulate diversity.

The model assumes non-selective extinction during mass extinction events (that is, the field of bullets model of extinction; everything is equally likely to die, no matter the age of the clade and regardless of adaptation)56. However, there is much fossil evidence supporting extinction selectivity57,58. It could be argued that higher extinction rates at diversity hotspots would have delayed their subsequent recovery, flattening global diversity trends. This argument is difficult to reconcile with Sepkoski’s genus-level global diversity curve but could be consistent with the standardized diversity curve of ref. 21. Similarly, the model is also not suitable for reproducing the explosive radiations of certain taxonomic groups after mass extinctions, which could explain the offset between the model and fossil observations in the early Mesozoic (Fig. 1).

Letting D represent regional diversity (number of genera within a given seafloor point) and t represent time, the logistic model is formalized by the following differential equation:

$$frac{partial Dleft(tright)}{partial t}=rho Dleft[1-frac{D}{{K}_{{rm{eff}}}}right]$$

(4)

where D(t) is the number of genera at time t and Keff is the effective carrying capacity or maximum number of genera that a given seafloor point (that is, grid cell area after gridding) can carry at that time, t. In our logistic model, Keff is allowed to vary within a fixed range of values (from Kmin to Kmax) as a positive linear function of the POC flux at a given location and time as follows:

$${K}_{{rm{eff}}}={K}_{max }-left({K}_{max }-{K}_{min }right)frac{{text{POC flux}}_{max }-text{POC flux}}{{text{POC flux}}_{max }-{text{POC flux}}_{min }}$$

(5)

where POC fluxmin and POC fluxmax correspond to the 0.01 and 0.99 quantiles of the POC flux range in the whole Phanerozoic dataset.

In the logistic model, the net diversification rate decreases as regional diversity approaches its Keff. The exponential model is a particular case of the logistic model when Keff approaches infinity and, therefore, neither the origination rate nor the extinction rate depend on the standing diversities. In this scenario, diversity grows in an unlimited manner over time only truncated by the effect of mass extinctions and/or by the dynamics of the seafloor (creation versus destruction). Thus, the exponential model is as follows:

$$frac{partial Dleft(tright)}{partial t}=rho D$$

(6)

where the rate of change of diversity (the time derivative) is proportional to the standing diversity D such that the regional diversity will follow an exponential increase in time at a speed controlled by the temperature- and food-dependent net diversification rate. Even if analytical solutions exist for the steady-state equilibrium of the logistic and exponential functions, we solved the ordinary differential equations (4) and (6) using numerical methods with a time lag of 1 Myr to account for the spatially and temporally varying environmental constraints, seafloor dynamics and mass extinction events.

As the analysis of global fossil diversity curves is unable to discern the causes of diversity loss during mass extinctions, our imputation of negative diversification rates could have overestimated the loss of diversity in those cases in which sea level fall, a factor already accounted for by our model, contributed to mass extinction. This effect was particularly recognizable across the Permian–Triassic mass extinction (Extended Data Fig. 6d–f), and supports previous claims that the decline in the global area of the shallow water shelf exacerbated the severity of the end-Permian mass extinction34.

Model coupling

As stated above, the coupled plate tectonic–palaeoelevation (palaeogeographical) model corresponds to a tracer-based model (a Lagrangian-based approach) that simulates and tracks the spatiotemporal dynamics of ocean and flooded continental points. The diversification models start at time 541 Ma with all active points having a D0 = 1 (one single genus everywhere) and we let points accumulate diversity heterogeneously with time according to seafloor age distributions (for ocean points) and the time that continents have been underwater (for flooded continental points). The ocean points are created at mid-ocean ridges and disappear primarily at subduction zones. Between their origin and demise, the points move following plate tectonic motions and we trace their positions while accumulating diversity. The flooded continental points begin to accumulate diversity from the moment that they are submerged, starting with a D value equal to the nearest neighbour flooded continental point with D > 1, thereby simulating a process of coastal recolonization (or immigration). The diversification process remains active while the seafloor points remain underwater, but it is interrupted, and D set to 0, in those continental points that emerge above sea level. Similarly, seafloor points corresponding to ocean domains disappear in subduction zones, and their diversity is lost. We track the geographical position of the ocean and flooded continental points approximately every 5 Myr, from 541 Ma to the present. Each and every one of the tracked points accumulates diversity over time at a different rate, which is modulated by the environmental history (seawater temperature and food availability) of each point, as described in equations (1)–(3). When a point arrives in an environment with a carrying capacity lower than the diversity it has accumulated through time, we reset the diversity of the point to the value of the carrying capacity, thereby simulating local extinction.

Seawater temperature (T) and food availability (POC flux) are provided by the cGENIE model, which has a spatial and temporal resolution coarser than the palaeogeographical model. The cGENIE model provides average seawater T and POC flux values in a 36 × 36 equal area grid (grid cell area equivalent to 2° latitude by 10° longitude at the equator) and 30 time slices or snapshots (from 541 Ma to the present: each ~20 Myr time intervals). To have environmental inputs for the 82 time slices of the plate tectonic–palaeoelevation model, we first interpolate the cGENIE original model output data on a 0.5° by 0.5° grid to match the annotated grids provided by the plate tectonic–palaeoelevation model. As the relatively coarse spatial resolution of the cGENIE model prevents rendering the coast–ocean gradients, we assign surface T and POC flux at the base of the euphotic zone to the flooded continental shelf grid cells, and deep ocean T and POC flux at the bottom of the ocean to the ocean grid cells. As there are time slices without input data of seawater T and POC flux, we interpolate/extrapolate seawater T and POC flux values into the 0.5° by 0.5° flooded continental shelf and ocean grids independently. Finally, we interpolate values from these 0.5° by 0.5° flooded continental shelf and ocean grids into the exact point locations in each time frame. Thus, each active point is tracked with its associated time-varying T and POC flux values throughout its lifetime. On average, 6,000 flooded continental points and 44,000 oceanic points were actively accumulating diversity in each time frame. The model cannot simulate the singularities of relatively small enclosed seas for which the spatial resolution of the palaeogeographical and Earth system models is insufficient to capture relevant features (such as palaeobathymetry, seawater temperature) in detail. The method is also likely to underestimate the diversity of epeiric (inland) seas due to the difficulty of simulating immigration, a process that is strongly influenced by the effect of surface ocean currents and is not considered here. However, as stated above, the model considers recolonization of recently submerged areas by marine biota from nearby coastal environments, which partially explains coastal immigration.

Estimation of global diversity from regional diversity

Our regional diversity maps are generated by separately interpolating ocean point diversity and flooded continental point diversity into the 0.5° by 0.5° annotated grids provided by the palaeogeographical model. We calculate global diversity at each time step from each of the regional diversity maps following a series of steps to integrate diversity along line transects from diversity peaks (maxima) to diversity troughs (minima) (Extended Data Fig. 1). To select the transects, first, we identify on each of the regional diversity maps the geographical position of the diversity peaks. We identify local maxima (that is, grid cells with diversity greater than their neighbour cells), and define the peaks as those local maxima with diversity greater than the 0.75 quantile of diversity values in all local maxima in the map. In the case of grid cells with equal neighbour diversity, the peak is assigned to the grid cell in the middle. We subsequently identify the geographical position of the diversity troughs, which are defined as newly formed ocean grid cells (age = 0 Myr) and, therefore, with diversities equal to one. The troughs are mostly located at mid-ocean ridges.

On each of the 82 spatial diversity maps, we trace a line transect from each diversity peak to its closest trough, provided that the transect does not cross land in more than 20% of the grid cells along the linear path (Supplementary Video 7). On average, for each spatial diversity map, we trace 400 (σ = ±75) linear transects. This sampling design gives rise to transects of different lengths, which may bias the estimates of global diversity. To minimize this bias, we cut the tail of the transects to have a length of 555 km equivalent to 5° at the equator. We tested an alternative cut-off threshold, 1,110 km, and the results do not alter the study’s conclusions.

We apply Bresenham’s line algorithm59 to detect the grid cells crossed by the transects and annotate their diversity. To integrate regional diversity along the transects, we developed a method to simplify the scenario of peaks and troughs heterogeneously distributed on the 2D diversity maps. The method requires (1) a vector (the transect) of genus richness (αn) at n different locations (grids) arranged in a line (1D) of L grids, and (2) a coefficient of similarity (Vn,n + 1) between each two neighbouring locations, n and n + 1. Vn,n + 1, the coefficient of similarity, follows a decreasing exponential function with distance between locations. The number of shared genera between n and n + 1 is Vn,n + 1 × min(αn; αn + 1). We integrate diversity from peaks to troughs and assume that, along the transect, αn + 1 is lower than αn. We further assume that the genera present in n and n + 2 cannot be absent from n + 1. Using this method, we integrate the transect’s diversity (γi) using the following equation:

$${gamma }_{i}={ {mathbf{upalpha}}}_{1}+{sum }_{n=1}^{L-1}left(1-{V}_{n,n+1}right){ {mathbf{upalpha}}}_{n+1}$$

(7)

To integrate the diversity of all transects (γi) on each 2D diversity map (or time slice), we apply the same procedure as described above (Extended Data Fig. 1). We first sort the transects in descending order from the highest to the lowest diversity. We then assume that the number of shared genera between transect i and the rest of the transects with greater diversity {1, 2, …, i − 1} is given by the distance of its peak to the nearest neighbour peak (NN(i)) of those already integrated {1, 2, …, i − 1}. Thus, we perform a zigzag integration of transects’ diversities down gradient, from the greatest to the poorest, weighted by the nearest neighbour distance among the peaks already integrated. As a result, the contribution of each transect to global diversity will depend on its diversity and its distance to the closest transect out of all those transects already integrated. Using this method, we linearize the problem to simplify the cumbersome procedure of passing from a 2D regional diversity map to a global diversity estimate without knowing the identity (taxonomic affiliation) of the genera. If γtotal is the global diversity at time t:

$${gamma }_{{rm{total}}}={gamma }_{1}+{sum }_{i=2}^{j}left(1-{V}_{{rm{NN}}left(iright),i}right){gamma }_{i}$$

(8)

Finally, the resulting global estimates are plotted against the midpoint value of the corresponding time interval to generate a synthetic global diversity curve. To compare the global diversity curves produced by the diversification models with those composed from the fossil record, Lin’s CCC60 is applied to the data normalized to the min–max values of each time series (that is, rescaled within the range 0–1). Lin’s CCC combines measures of both precision and accuracy to determine how far the observed data deviate from the line of perfect concordance (that is, the 1:1 line). Lin’s CCC increases in value as a function of the nearness of the data’s reduced major axis to the line of perfect concordance (the accuracy of the data) and of the tightness of the data around its reduced major axis (the precision of the data).

The time series of global diversity generated from the fossil record and from the diversification model exhibit serial correlation and the resulting CCCs are therefore inflated. The use of methods for analysing non-zero autocorrelation time series data, such as first differencing or generalised least squares regression, enables high-frequency variations along the time series to be taken into account. However, the relative simplicity of our model, which was designed to reproduce the main Phanerozoic trends in global diversity, coupled with the fact that biases in the fossil data would introduce uncertainty into the analysis, leads us to focus our analysis on the long-term trends, obviating the effect of autocorrelation.

Model parameterization and calibration

The diversification models are parameterized assuming a range of values that constrain the lower and upper limits of the genus-level net diversification rate (ρmin and ρmax, respectively) (Extended Data Table 1) according to previously reported estimates from fossil records (figures 8 and 11 of ref. 5). A range of realistic values is assigned for the parameters Q10 and Kfood, determining, respectively, the thermal sensitivity and food dependence of the net diversification rate. We test a total of 40 different parameter combinations (Extended Data Table 2). The resulting estimates of diversity are then compared against the fossil diversity curves of ref. 20, ref. 21 or ref. 22, and the 15 parameter combinations providing the highest CCCs are selected.

The results of the logistic diversification model rely on the values of the minimum and maximum carrying capacities (Kmin and Kmax, respectively) within which the spatially resolved effective carrying capacities (Keff) are allowed to vary. The values of Kmin and Kmax are therefore calibrated by running 28 simulations of pair-wise Kmin and Kmax combinations increasing in a geometric sequence of base 2, from 2 to 256 genera (Extended Data Fig. 4). We perform these simulations independently for each of the 15 parameter settings selected previously (Extended Data Fig. 4 and Extended Data Table 2). Each combination of Kmin and Kmax produces a global diversity curve, which is evaluated as described above using Lin’s CCC.

Calculating estimates of global diversity from regional diversity maps in the absence of information on genus-level taxonomic identities requires that we assume a spatial turnover of taxa with geographical distance (Extended Data Fig. 1). Distance-decay curves are routinely fitted by calculating the ecological similarity (for example, the Jaccard similarity index) between each pair of sampling sites, and fitting an exponential decay function to the points on a scatter plot of similarity (y axis) versus distance (x axis). Following this method, we fit an exponential decay function to the distance–decay curves reported in ref. 61, depicting the decrease in the Jaccard similarity index (J) of fossil genera with geographical distance (great circle distance) at different Phanerozoic time intervals:

$$J={J}_{{rm{o}}{rm{f}}{rm{f}}}+(,{J}_{max}text{-}{J}_{{rm{o}}{rm{f}}{rm{f}}}){{rm{e}}}^{-lambda times {rm{d}}{rm{i}}{rm{s}}{rm{t}}{rm{a}}{rm{n}}{rm{c}}{rm{e}}}$$

(9)

where Joff = 0.06 (n.d.) is a small offset, Jmax = 1.0 (n.d.) is the maximum value of the genus-based Jaccard similarity index and λ = 0.0024 (km−1) is the distance-decay rate.

The Jaccard similarity index (J) between consecutive points n and n + 1 is bounded between 0 and min(αn; αn + 1)/max(αn; αn + 1). A larger value for J would mean that there are more shared genera between the two communities than there are genera within the least diverse community, which is ecologically absurd. However, using a single similarity decay function can lead the computed value of J to be locally larger than min(αn; αn + 1)/max(αn; αn + 1). To prevent this artefact, we use the Simpson similarity index or ‘overlap coefficient’ (V) instead of J. V corresponds to the percentage of shared genera with respect to the least diverse community (min(αn ; αn + 1)). V is bounded between 0 and 1, whatever the ratio of diversities. As the pre-existing estimates of similarity are expressed using J (ref. 61), we perform the conversion from J to V using the algebraic expression V = (1 + R) × J/(1 + J) where R = max(αn; αn + 1)/min(αn; αn + 1) (Supplementary Note 1). In the cases in which J exceeds the min(αn; αn + 1)/max(αn; αn + 1), V becomes >1 and, in those cases, we force V to be <1 by assuming R = 1, that is αn = αn + 1.

The model considers a single distance-decay function for the spatial turnover of taxonomic composition. However, the degree of provinciality (that is, the partitioning of life into distinct biogeographical units) varies in space and time as a result of environmental gradients62 and plate tectonics63. In fact, the increase in provinciality has been invoked as the main driver of the increase in global diversity, especially in the Late Cretaceous and Cenozoic22,62,63. This is a deficiency of the model. Unfortunately, information on the extent to which marine provinciality has varied in space and time throughout the Phanerozoic is limited61,62, and there is no simple (mechanistic) way to implement different distance-decay functions of taxonomic similarity in the model. There is a clear difference between longitudinal and latitudinal distance, the latter being a more significant source of taxonomic turnover62. This effect would add to the observation that tropical diversity hotspots became more prominent towards the end of the Phanerozoic, offering two complementary explanations for the increase in diversity in the Mesozoic: (1) favourable conditions for the development of diversity hotspots and (2) a higher degree of provinciality.

Fossil data

We used three fossil diversity curves of reference20,21,22 to (1) extract the patterns of mass extinctions (starting time, duration and magnitude) imposed on the model, and (2) compare the global diversity curves produced by the model with those generated from fossil data. Sepkoski’s global diversity curve corresponds to marine invertebrates listed in Sepkoski’s published marine genus compendium20 (data downloaded from Sepkoski’s Online Genus Database; http://strata.geology.wisc.edu/jack/). The global diversity curves of refs. 21,22 are digitized from the original sources, that is, figure 3 of ref. 21 and figure 2a of ref. 22, respectively, using the free software XYscan. The curve reported by ref. 21 corresponds to genus-richness estimates obtained after correcting for sampling effort using the shareholder quorum subsampling technique. This curve is binned at approximately 11 Myr time intervals and includes non-tetrapod marine animals of which Anthozoa, Trilobita, Ostracoda, Linguliformea, Articulata, Bryozoa, Crinoidea, Echinoidea, Graptolithina, Conodonta, Chondrichthyes, Cephalopoda, Gastropoda and Bivalvia are the major taxonomic groups. The curve reported in ref. 22 corresponds to 1 Myr range-through richness estimates of marine skeletonized invertebrate genera including Brachiopoda, Bivalvia, Anthozoa, Trilobita, Gastropoda, Crinoidea, Blastoidea, Edrioasteroidea, Ammonoidea, Nautiloidea and Bryozoa. All digitized (and interpolated) diversity data and the net diversification rate data imputed by the model to simulate mass extinctions are provided as Source Data for Fig. 1.

The choice to analyse the data for all animals is somewhat in conflict with the fact that, in the open ocean, photosynthetic primary production and the flux of organic matter at the bottom of the water column are decoupled. This decoupling leads to contrasting differences in the amount of food available to the planktonic/nektonic and benthic communities yet, in the model, we assume that the amount of organic carbon reaching the seafloor is a proxy for food supply. Given that most of the diversity is concentrated in shallow shelf seas, this assumption is likely to be of only relatively minor importance in a global context.

OBIS data

We use the occurrence records of genera belonging to two of the most diverse marine invertebrate groups: subphylum Crustacea and phylum Mollusca, as downloaded from the Ocean Biodiversity Information System (OBIS) on 22 October 2021 (www.obis.org). The list of genera is validated with the genera names in WoRMS (https://www.marinespecies.org) and only the accepted, extant and marine names were selected for the analysis. This corresponds to a total of 10,018,142 records of 9,750 genera (6,540,489 records and 5,533 genera of crustaceans and 3,477,653 records and 4,217 genera of molluscs) collected from 1920 to 2021. The records are gridded into hexagons (800,000 km2 at the equator) to account for different gamma (regional) diversity across latitudes, otherwise, a bias would occur in the resulting estimates. To ensure sufficient sampling size, we selected only those hexagons with more than or equal to 10 occurrence records and with more than three genera. Furthermore, we use the frequencies of the genera to estimate the number of unobserved genera per hexagon. We do so by extrapolating the number of genera on the basis of a bias-corrected Chao estimate according to the tail of rare genera (that is, those genera that have only one or two occurrence records in a hexagon)64,65. The final number of genera per hexagon is the sum of the observed and unobserved estimates of genera. The analysis is performed with the package vegan66 in R v.4.1.2. Finally, we spatially overlap the hexagons and 0.5° × 0.5° square grid to match the map of the palaeo analysis and extract the value of the diversity index per coastal grid in QGIS v.3.22.0. The comparison between model and observations is made on the normalized diversities (0–1) bounded between the 0.05 and 0.95 quantiles to minimize the effect of outliers in the observed pattern. These diversity data are provided as Source Data for Fig. 2.

Testing a static (null) palaeogeographical model

To evaluate the effect of palaeogeography on global diversity dynamics, we carry out simulations for three static palaeogeographical configurations: the Devonian (400 Ma), the Carboniferous (300 Ma) and the present. For each of these three configurations, the model runs for 541 Myr in a ‘static mode’, that is, diversity accumulates steadily at a pace determined by the temperature and food assigned to each grid at the selected static configuration. Mass extinctions are imposed the same way we do in the default model with variable palaeogeography. The test is performed for the exponential diversification model and the calibrated logistic model and for each of the three mass extinction patterns20,21,22. Extended Data Fig. 7a–i shows the differences between the log-transformed normalized diversities (between 0 and 1) produced by the calibrated logistic model with static palaeogeography (nDiv tectonics OFF) and with variable palaeogeography (nDiv tectonics ON). Red and blue colours denote, respectively, the extent to which the static model produces diversity estimates above or below those produced by the model with plate tectonics. Tropical regions are dominated by reddish colours indicating that the static model particularly overestimates diversity in these regions, where high temperatures accelerate diversification. In the absence of plate tectonics, the model leads to a scenario of uncontrolled diversity growth (mainly in the tropical shelf seas; reddish areas on maps) such that even mass extinctions cannot dampen diversity growth (Extended Data Fig. 7j–l) and only effective carrying capacities prevent diversity from running away (Extended Data Fig. 7m–o). A static geographical configuration also prevents diversity hotspots from disappearing at convergent plate boundaries. These results support the idea that Earth’s palaeogeographical evolution and sea level changes, by creating, positioning and destroying seafloor habitats, have had a key role in constraining the growth of diversity throughout the Phanerozoic.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.


Source: Ecology - nature.com

MIT engineers design surfaces that make water boil more efficiently

Comparative efficacy of phosphorous supplements with phosphate solubilizing bacteria for optimizing wheat yield in calcareous soils