Global diversity dynamics in the fossil record are regionally heterogeneous
Spatial standardisation workflowTo produce spatially-standardised fossil occurrence datasets which remain geographically consistent through time, we designed a subsampling algorithm which enforces consistent spatial distribution of occurrence data between time bins, while maximising data retention and permitting highly flexible regionalisation (Fig. 8). Our method was developed in light of, and takes some inspiration from, the spatial standardisation procedure of Close et al.2. This method provides, within a given time bin, subsamples of occurrence data with threshold MST lengths. An average diversity estimate can be taken from this ‘forest’ of MSTs, selecting only those of a target tree length to ensure spatially-standardised measurements. It does not produce a single dataset across time bins, however; rather a series of discontinuous, bin-specific datasets which cannot then simply be concatenated as the spatial extents of each bin-specific forest are not standardised (despite each individual MST being so), even when MSTs are assigned to a specific geographic region, e.g. a continent or to a particular latitudinal band. This prevents estimation of rates, because such analyses require datasets that span multiple time bins and remain geographically consistent and spatially standardised through the time span of interest. This is the shortcoming that our method overcomes. The workflow consists of three main steps.Fig. 8: Component steps of our spatial standardisation workflow.A A spatial window (dotted lines) is used to demarcate the spatial region of interest, which may shift in a regular fashion through time to track that region. Data captured in each window is clipped to a target longitude–latitude range (orange lines). B The data forming the longitude–latitude extent is marked, then masked from further subsampling. C Data are binned using a hexagonal grid, the tally of occurrences in each grid cell taken, and a minimum spanning tree constructed from the grid cell centres. D The cells with the smallest amount of data are iteratively removed from the minimum spanning tree until a target tree length is reached.Full size image1. First, the user demarcates a spatially discrete geographic area (herein the spatial window) and a series of time bins into which fossil occurrence data is subdivided. Occurrence data falling outside the window in each time bin are dropped from the dataset, leaving a spatially restricted subsample (Fig. 8A). Spatial polygon demarcation is a compromise between the spatial availability of data to subsample and the region of interest to the user but allows creation of a dataset where regional nuances of biodiversity may be targeted. Careful choice of window extent can even aid subsequent steps by targeting regions that have a consistently sampled fossil record through time, even if the extent of that record fluctuates. To account for spatially non-random changes in the spatial distribution of occurrence data arising from the interlinked effects of continental drift, preservation potential and habitat distribution17, the spatial polygon may slide to track the location of the available sampling data through time. This drift is performed with two conditions. First, the drift is unidirectional so that the sampling of data remains consistent relative to global geography, rather than allowing the window to hop across the globe solely according to data availability and without biogeographic context. Second, spatial window translation is performed in projected coordinates so that its sampling area remains near constant between time bins, avoiding changes in spatial window area that could induce sampling bias from the species-area effect.2. Next, subsampling routines are applied to the data to standardise its spatial extent to a common threshold across all time bins using two metrics: the length of the MST required to connect the locations of the occurrences; and the longitude–latitude extent of the occurrences. MST length has been shown to measure spatial sampling robustly as it captures not just the absolute extent of the data but also the intervening density of points, and so is highly correlated with multiple other geographic metrics16. MSTs with different aspect ratios may show similar total lengths but could sample over very different spatial extents, inducing a bias by uneven sampling across spatially organised diversity gradients16; standardising longitude–latitude extent accounts for this possibility. The standardisation methods can be applied individually or serially if both MST length and longitude–latitude range show substantial fluctuations through time. Data loss is inevitable during subsampling and may risk degrading the signals of origination, extinction and preservation. To address this issue, subsampling is performed to retain the greatest amount of data possible. During longitude–latitude standardisation, the range containing the greatest amount of data is preserved. During MST standardisation, occurrences are spatially binned using a hexagonal grid to reduce computational burden and to permit assessment of spatial density (Fig. 8B). The grid cells containing the occurrences that define the longitude–latitude extent of the data are first masked from the subsampling procedure so that this property of the dataset is unaffected, and then the occurrences within the grid cells at the tips of the MST are tabulated. Tip cells with the least data are iteratively removed (removal of non-tip cells may have little to no effect on the tree topology) until the target MST length is achieved (Fig. 8D), with tree length iteratively re-calculated to include the branch lengths added by the masked grid cells.For both methods, the solution with the smallest difference to the target is selected and so both metrics may fluctuate around this target from bin to bin, with the degree of fluctuation depending upon the availability of data to exclude—larger regions that capture more data are more amenable to the procedure than smaller regions. Similarly, the serial application of both metrics reduces the pool of data available to the second method, although longitude–latitude standardisation is always applied first in the serial case so that the resultant extent will be retained during MST standardisation. Consequently, the choice of standardisation procedure and thresholds must be tailored to the availability and extent of data within the sampling region through time, along with the resulting degree of data loss. This places further emphasis on the careful construction of the spatial window in the first step. Threshold choice is also a compromise between data loss and consistency of standardisation across the dataset and so it may be necessary to choose targets that standardise spatial extent well for the majority of the temporal range of a dataset, rather than imposing a threshold that spans the entire data range but causes unacceptable data loss in some bins.3. Once the time-binned, geographically restricted data have been spatially standardised, the relationship between diversity and spatial extent is scrutinised. After standardisation, it is expected that residual fluctuations in spatial extent should induce little or no change in apparent diversity. Bias arising from temporal variation in sampling intensity may still be present, so diversity is calculated using coverage-based rarefaction (also referred to as shareholder quorum subsampling13,62,63), with a consistent coverage quorum from bin to bin. While coverage-based rarefaction has known biases, it remains the most accurate non-probabilistic means of estimating fossil diversity14. As such, we consider it the most appropriate method to assess the diversity of a region-level fossil dataset. The residual fluctuations in spatial extent may then be tested for correlation with spatially standardised, temporally corrected diversity. If a significant relationship is found, then the user must go back and alter the standardisation parameters, including the spatial window geometry and drift, the longitude–latitude threshold, and the MST threshold. Otherwise, the dataset is considered suitable for further analysis.We implement our subsample standardisation workflow in R with a custom algorithm, spacetimestand, along with a helper function spacetimewind to aid the initial construction of spatial window. spacetimestand can then accept any fossil occurrence data with temporal constraints in millions of years before present and longitude–latitude coordinates in decimal degrees. Spatial polygon construction and binning is handled using the sp library64, MST manipulation using the igraph and ape libraries65,66, spatial metric calculation using the sp, geosphere and GeoRange libraries67,68, hexagonal gridding using the icosa library69, and diversity calculation by coverage-based rarefaction using the estimateD function from the iNEXT library70. Next, we apply our algorithm to marine fossil occurrence data from the Late Permian to Early Triassic.Data acquisition and cleaningFossil occurrence data for the Late Permian (260 Ma) to Early Jurassic (190 Ma) were downloaded from the PBDB on 28/04/21 with the default major overlap setting applied (an occurrence is treated as within the requested time span if 50% or more of its stratigraphic duration intersects with that time span), in order to minimise edge effects resulting from incomplete sampling of taxon ranges within our study interval of interest (the Permo-Triassic to Triassic-Jurassic boundaries). Other filters in the PBDB API were not applied during data download to minimise the risk of data exclusion. Occurrences from terrestrial facies were excluded, along with plant, terrestrial-freshwater invertebrate and terrestrial tetrapod occurrences (as these may still occur in marine deposits from transport) and occurrences from several minor and poorly represented phyla. Finally, non-genus level occurrences were removed, leaving 104,741 occurrences out of the original 168,124. Based on previous findings2, siliceous occurrences were not removed from the dataset, despite their variable preservation potential compared to calcareous fossils. To increase the temporal precision of the dataset, occurrences with stratigraphic information present were revised to substage- or stage-level precision using a stratigraphic database compiled from the primary literature. To increase the spatial and taxonomic coverage of the dataset, the PBDB data were supplemented by an independently compiled genus-level database of Late Permian to Late Triassic marine fossil occurrences36. Prior to merging, occurrences from the same minor phyla were excluded, along with a small number lacking modern coordinate data, leaving 47,661 occurrences out of an original 51,054. Absolute numerical first appearance and last appearance data (FADs and LADs) were then assigned to the occurrences from their first and last stratigraphic intervals, based on the ages given in A Geologic Timescale 202071. Palaeocoordinates were calculated from the occurrence modern-day coordinates and midpoint ages using the Getech plate rotation model. Finally, occurrences with a temporal uncertainty greater than 10 million years and occurrences for which palaeocoordinate reconstruction was not possible were removed from the composite dataset, leaving 145,701 occurrences out of the original 152,402.In the total dataset, we note that the age uncertainty for occurrences is typically well below their parent stage duration, aside for the Wuchiapingian and Rhaetian where the mean and quartile ages are effectively the same as the stage length (Fig. S44). This highlights the chronostratigraphic quality of our composite dataset, particularly for the Norian stage (~18-million-year duration) which has traditionally been an extremely coarse and poorly resolved interval in Triassic-aged macroevolutionary analyses. Taxonomically, most occurrences are molluscs (Fig. 8), which is unsurprising given the abundance of ammonites, gastropods and bivalves in the PBDB, but introduces the caveat that downstream results will be driven primarily by these clades. Foraminiferal and radiolarian occurrences together comprise the next most abundant element of the composite dataset, demonstrating that we nonetheless achieve good coverage of both the macrofossil and microfossil records, along with broad taxonomic coverage in the former despite the preponderance of molluscs.Spatiotemporal standardisationWe chose a largely stage-level binning scheme when applying our spatial standardisation procedure for several reasons. First, the volume of data in each bin is greater than in a substage bin, providing a more stable view of occurrence distributions through time and increasing the availability of data for subsampling. Spatial variation at substage level might still affect the sampling of diversity, but the main goal of this study is to analyse origination and extinction rates where taxonomic ranges are key rather than pointwise taxonomic observations. Consequently, substage level variation in taxon presences likely amounts to noise when examining taxonomic ranges, making stage-level bins preferable in order maximise signal.During exploratory standardisation trials, we found a large crash in diversity and spatial sampling extent during the Hettangian (201.3–199.3 Mya). No significant relationships with spatially-standardised diversity were found when the Hettangian bin was excluded from correlation tests, indicating its disproportionate effect in otherwise well-standardised time series. Standardising the data to the level present in the Hettangian would have resulted in unacceptable data loss so we instead accounted for this issue by merging the Hettangian bin with the succeeding Sinemurian bin, where sampling returns to spatial extents consistent with older intervals. While this highlights a limitation of our method, as the Hettangian is 2: positive support, log BF > 6: strong support)85 using the -plotRJ function of PyRate.Probabilistic diversity estimationTraditional methods of estimating diversity do not directly address uneven sampling arising from variation in preservation, collection and description rates, and their effectiveness is highly dependent on the structure of the dataset. We present an alternative method to infer corrected diversity trajectories based on the sampled occurrences and on the preservation rates through time and across lineages as inferred by PyRate, which we term mcmcDivE. The method implements a hierarchical Bayesian model to estimate corrected diversity across arbitrarily defined time bins. The method estimates two classes of parameters: the number of unobserved species for each time bin and a parameter quantifying the volatility of the diversity trajectory.We assume the sampled number of taxa (i.e. the number of fossil taxa, here indicated with xt) in a time bin to be a random subset of an unknown total taxon pool, which we indicate with Dt. The goal of mcmcDivE is to estimate the true diversity trajectory ({{{{{bf{D}}}}}},=,left{{D}_{1},{D}_{2},ldots ,{D}_{t}right}), of which the vector of sampled diversity ({{{{{bf{x}}}}}},=,{{x}_{1},{x}_{2},ldots ,{x}_{t}}) is a subset. The sampled diversity is modelled as a random sample from a binomial distribution86 with sampling probability pt:$${x}_{t}, ,sim, {{{{{rm{Bin}}}}}}({D}_{t},{p}_{t})$$
(1)
We obtain the sampling probability from the preservation rate (qt) estimated in the initial PyRate analysis. If the PyRate model assumes no variation across lineages the sampling probability based on a Poisson process is ({p}_{t},=,1,-,{{{{{rm{exp }}}}}}({-q}_{t},times, {delta }_{t})), where δt is the duration of the time bin. When using a Gamma model in PyRate, however, the qt parameter represents the mean rate across lineages at time t and the rate is heterogeneous across lineages based on a gamma distribution with shape and rate parameters equal to an estimated value α.To account for rate heterogeneity across lineages in mcmcDivE, we draw an arbitrarily large vector of gamma-distributed rate multipliers g1, …, gR ~ Γ(α,α) and compute the mean probability of sampling in a time bin as:$${p}_{t},=,frac{1}{R}mathop{sum }limits_{i,=,R}^{R}1,-,{{{{{rm{exp }}}}}}(-{q}_{t},,times, {g}_{i},times, {delta }_{t})$$
(2)
We note that while qt quantifies the mean preservation rate in PyRate (i.e. averaged among taxa in a time bin t), the mean sampling probability pt will be lower than (1,-,{{{{{rm{exp }}}}}}({-q}_{t},times, {delta }_{t})) (i.e. the probability expected under a constant preservation rate equal to qt) especially for high levels of rate heterogeneity, due to the asymmetry of the gamma distribution and the non-linear relationship between rates and probabilities. We sample the corrected diversity from its posterior through MCMC. The likelihood of the sampled number of taxa is computed as the probability mass function of a binomial distribution with Di as the ‘number of trials’ and pi as the ‘success probability’. To account for the expected temporal autocorrelation of a diversity trajectory87, we use a Brownian process as a prior on the log-transformed diversity trajectory through time. Under this model, the prior probability of Dt is:$$Pleft({{{{{rm{log }}}}}}left({D}_{t}right)right),{{{{{mathscr{ sim }}}}}},{{{{{mathscr{N}}}}}}({{{{{rm{log }}}}}}left({D}_{t,-,1}right),,sqrt{{sigma }^{2},,times, ,{delta }_{t}})$$
(3)
where σ2 is the variance of the Brownian process. For the first time bin in the series, Dt = 0, we use a vague prior ({{{{{mathscr{U}}}}}}(0,infty )). Because the variance of the process is itself unknown and may vary among clades as a function of their diversification history, we assign it an exponential hyperprior Exp(1) and estimate it using MCMC. Thus, the full posterior of the mcmcDivE model is:$$underbrace{P(D,{sigma }^{2}|x,q,alpha )}_{{{{{rm{posterior}}}}}}propto underbrace{P(x|D,q,alpha )}_{{{{{rm{likelihood}}}}}}times underbrace{P(D|{sigma }^{2})}_{{{{{rm{prior}}}}}}times underbrace{P({sigma }^{2})}_{{{{{rm{hyperprior}}}}}}$$
(4)
where ({{{{{bf{D}}}}}},=,{{D}_{0},{D}_{1},ldots ,{D}_{t}}) and ({{{{{bf{q}}}}}},=,{{q}_{0},{q}_{1},ldots ,{q}_{t}}) are vectors of estimated diversity, sampled diversity, and preservation rates for each of T time bins. We estimate the parameters D and σ2 using MCMC to obtain samples from their joint posterior distribution. To incorporate uncertainties in q and α we randomly resample them during the MCMC from their posterior distributions obtained from PyRate analyses of the fossil occurrence data. While in mcmcDivE we use a posterior sample of qt and α precomputed in PyRate for computational tractability of the problem, a joint estimation of all PyRate and mcmcDivE parameters is in principle possible, particularly for smaller datasets. mcmcDivE is implemented in Python v.3 and is available as part of the PyRate software package.Simulated and empirical diversity analysesWe assessed the performance of the mcmcDivE method using 600 simulated datasets obtained under different birth-death processes and preservation scenarios. The settings of the six simulations (A–F) are summarised in Table S65 and we simulated 100 datasets from each setting. Since the birth-death process is stochastic and can generate a wide range of outcomes, we only accepted simulations with 100 to 500 species, although the resulting number of sampled species decreased after simulating the preservation process. From each birth-death simulation we sampled fossil occurrences based on a heterogeneous preservation process. Each simulation included six different preservation rates which were drawn randomly within the boundaries 0.25 and 2.5, with rate shifts set to 23, 15, 8, 5.3 and 2.6 Ma. To ensure that most rates were small (i.e. reflecting poor sampling), we randomly sampled preservation rates as:$$q, sim ,exp left({{{{{mathscr{U}}}}}}left(log left(0.25right),,log left(2.5right)right)right)$$
(5)
In two of the five scenarios (D, F), we included strong rate heterogeneity across lineages (additionally to the rate variation through time), by assuming that preservation rates followed a gamma distribution with shape and rate parameters set to 0.5. This indicates that if the mean preservation rate in a time bin was 1, the preservation rate varied across lineages between More
