in

Stochastic models support rapid peopling of Late Pleistocene Sahul

Cellular-automaton framework

We constructed the cellular-automaton model in the open-access R statistical computing environment (cran.rproject.org). We provide all code, data and instructions to repeat the analysis65, which can be run on any desktop computer. Our spatial model is based on a 0.5° × 0.5° raster grid of Sahul from 0.5 to 43.0° S latitude, and 110.5 to 153.5° E longitude (86 rows and 87 columns). The land area of Sahul changes with fluctuating sea levels, so we estimated exposed land in 1000-year time slices to follow our available hindcasts of carrying capacity (see ‘Carrying capacity’ below) based on a digital elevation model and estimated sea-level change over the period of interest (from 85 to 40 ka; see Scenarios). We used the ETOPO1 global relief model of Earth’s surface66 to estimate the exposed landmass of Sahul through time. To reconstruct the landmass changes of Sahul every 1000 years, we applied sea-level variability outputs67 to the ETOPO1 model. We also included fluctuations in Lake Carpentaria that could potentially act as a natural barrier for human movement over time. We modified the contour of the lake based on modelled sea-level changes68 applied to the digital elevation model.

From an initially peopled cell (see Scenarios), the new population can grow following a Ricker population-dynamics model, and emigrate to adjacent cells following stochastically resampled rules of dispersal; likewise, each cell can receive immigrants from adjacent cells following similar dispersal rules (see ‘Emigration and immigration’, and ‘Long-distance dispersal’).

Population-dynamics model

Each cell within the grid acts as a particular sub-population unit within the overall dynamics of Sahul, and the summary information provided at the end of a simulation is an overall expression of all cells. The change in human abundance (N) within each cell is governed by the following phenomenological (Ricker) equation of population dynamics:

$$N_{i,j,t + 1} = N_{i,j,t}e^{r_mleft( {1 – frac{{N_{i,j,t}}}{{K_{i,j,t}}}} right)} – left( {E_{i,j,t} – I_{i,j,t}} right)$$

(1)

where i is the cell row number in the 0.5° × 0.5° latitude lattice, j is the cell column number, t is the time interval in units of human generations (1 g = 27.9 years)2, Ni,j,t + 1 is the number of individuals in cell i, j at the next time interval (t + 1), Ni, j, t is the number of individuals in cell i, j at time interval t, rm is the maximum rate of population increase when resources are not limiting, Ki, j, t is the cell-specific carrying capacity (see ‘Carrying capacity’ below), and the Ei, j, t and Ii, j, t parameters represent the number of individuals emigrating from and immigrating into the focal cell i, j per time interval t, respectively (see ‘Emigration and immigration’). As an estimate of rm, we set the age-structured Leslie matrix for Aboriginal hunter–gatherers2 to have a survival probability (subdiagonal matrix entries) all equal to 1 (complete survival in every age class), and then took the loge of that matrix’s dominant eigenvalue to the power of g multiplied by 2 as the generationally scaled rm estimate required for Eq. (1). Finally, we imposed a beta-resampled additional mortality parameter MMVP = 0.2 for cells with a population size <100 individuals. This threshold of NMVP = 100 is derived from the ecological concept of minimum viable population size69 where ~100 effective individuals are required to avoid inbreeding depression70. This assumes that individual fitness declines when populations decline below the minimum size due to inbreeding depression from non-random mating and other Allee effects related to social disruption.

Carrying capacity

We constructed a theoretical carrying capacity (K) for each cell from a hindcasted estimate of net primary production based on the LOVECLIM climate reconstruction2,3,35. In this case, we define carrying capacity as the upper limit to total abundance71, which is expressed mathematically as the point at which net population growth becomes stable (r = 0)72,73. LOVECLIM is a three-dimensional Earth-system model of intermediate complexity35 that includes representations of the atmosphere, ocean and sea ice, land surface (including vegetation), ice sheets and the carbon cycle. LOVECLIM produces climates over the past 800 ka in 1000-year time-averaged increments (layers) that we downscaled (using a bilinear interpolation)74,75 from a spatial resolution of 5.625° × 5.625° to 0.5° × 0.5°. For each grid cell and each 1000-year layer, we extracted net primary production76 (kg C m−2 year-1) as the comprehensive indicator of relative carrying capacity through time2. To translate net primary production into a carrying capacity expressed in units of humans the landscape was capable of supporting, we used the predicted relationship between net primary production and human density for hunter–gatherer societies37. Here, we rescaled the net primary production values over all grid cells to concur with the minimum (0.018 km−2) and maximum (1.152 km−2) human densities, multiplied by cell area (3080.25 km2) to give per-cell K. We then linearly interpolated these values between the 1000-year layers for each cell.

This approach assumes a linear relationship between K and net primary production (Supplementary Fig. 2). However, others have suggested a non-linear relationship where maximum K occurs at mid-range net primary production, because past humans possibly struck a compromise between high productivity and ease of passage and/or visibility to hunt prey by tending towards ecotones of mid-range productivity56,57; this type of non-linear relationship has also been identified between herbivore biomass and mean annual rainfall across sub-Saharan Africa77. We therefore also applied a 180°-rotated parabola model between K and net primary production (Pp) of the form:

$$K = a(P_p – h)^2 + K_{{mathrm{max}}}$$

(2)

where we set Kmax for the rotated parabolic relationship at 0.5Kmax for the linear relationship, a = −3, h = the median of Pp, and where we scaled the rotated parabolic ΣKmax across all cells and temporal layers to equal ΣKmax of the linear relationship (because of a higher frequency of cells with mid-range compared to high net primary production) (Supplementary Fig. 2). This approach assumes a logarithmic growth to and decline from a peak, so we also considered a third non-linear relationship—a reciprocal quadratic yield density—between K and Pp of the form:

$$K = frac{{P_p}}{{a + bP + cP_p^2}}$$

(3)

where a = 200, b = 0.6 and c = 0.2, which instead assumes a logistic growth to and decline from a more pronounced peak (Supplementary Fig. 2). Whether we invoked the linear, rotated parabolic or reciprocal quadratic yield relationship (see Scenarios), we Poisson-resampled the resultant K for each cell per generational time step to simulate spatio-temporal uncertainty in K.

Emigration and immigration

We reasoned that emigration out of a focal cell i, j to its eight immediately neighbouring cells would be a function of the gradient in Kt between two cells44, as well as how close the abundance Ni, j, t of cell i, j was to Ki, j, t. If the Poisson-resampled Kt ratio between two adjacent cells (Krel = Ki, j, t/Ki + y, j + x, t, where x and y are integers ranging from −1 to 1 to define the immediate one-cell neighbourhood of the focal cell i, j) was <1, and Ni, j, t/Ki, j, t was > a random uniform proportion between 0.3 and 0.7, then emigration ensued. The values of 0.3 and 0.7 originate from Birdsell78 who stated that group ‘budding off’ or ‘fissioning’ occurred when a population reached 30–70% of carrying capacity. We further imposed an exponential decay function to describe the declining probability of emigration (Pr(E)) as the ratio of Ki, j, t/Ki + y, j + x, t = Krel increased towards 1 (Supplementary Fig. 6), where:

$$Pr left( E right) = e^{ – 3.2K_{{mathrm{rel}}}}$$

(4)

For the size of the emigrating group Nmig, we assumed that this represented a beta-resampled proportion Pmig = 1/3, such that Nmig is centred on Ni, j, t × Pmig (with SD = 0.05Ni, j, t × Pmig) based on the observation that when coastal populations of Aboriginal Australians fission, the population tends to separate into one larger and one smaller group, with the latter emigrating79. We applied this calculation for each of the eight cell comparisons, removing those already emigrated from each cell’s subsequent estimate (moving in sequence through directions NW, N, NE, W, E, SW, S and SE of focal cell i, j). Likewise, when Ki, j, t/Ki + y, j + x,t > 1, immigration into cell i, j occurred following the same movement rules as for emigration.

Long-distance dispersal

We used the allometric relationship of natal dispersal for omnivorous and herbivorous mammals40 to predict a dispersal probability for humans. Assuming a mean adult mass of M = 50 kg, maximum natal dispersal distance Dm is estimated as aMb, where a = 3.31 ± 1.17 and b = 0.65 ± 0.05 for omnivores and herbivores combined40. This produced an estimated maximum dispersal distance Dm ranging from 22.4 to 69.3 km. As a maximum dispersal range, this compares well to the average mobility of African hunter–gatherers of 1400–3900 km2/generation39 (equivalent to a radius of 21.1–35.2 km assuming a perfect circle), and the 0.4–1.1 km yr−1 (11.2–30.7 km/generation) estimates for Palaeolithic human expansions in northern Europe41. Also, Gould38 reported journeys by Aboriginal Australians of 400 to 560 km as ‘not unusual’ and perhaps the greatest mobility ever recorded, moving as many as nine times in three months, and covering an area of ~2600 km2 (radius = 28.8 km).

Next, we used the estimated probability of dispersal (Pr(dmax)) of d exceeding multiples (1 to 10) of one cell width (0.5 × 111.12 = 55.6 km) as (Pr left( {d_{{mathrm{max}}}} right) = e^{ – d/aM^b}) (Supplementary Fig. 7). However, there is evidence globally that the territory size of hunter–gatherer groups is strongly related to local productivity, with a greater need to expand foraging areas as productivity declines19,80. Using territory size and rainfall data from Hiscock80, we assumed the same relative change in rainfall applied to net primary productivity, but shifted the power–law relationship upwards to match the slope of the upper limit of maximum dispersal distance (Supplementary Fig. 7b). Thus, for every tenfold decrease in relative net primary production, maximum dispersal distance increases by 12.7 times (Supplementary Fig. 7b). Once a long-distance dispersal event occurred, we Poisson-resampled the maximum dispersal distance to provide a δx and a δy to move from the focal cell in cell units (including a random direction: east–west for δx, and north–south for δy). The size of the long-distance-dispersing population followed the same rules as for neighbouring-cell emigration.

Distance-to-water limitation

While territory size, and hence, maximum dispersal distances increase with increasing aridity according to the relationships described above, there is evidence that human dispersal is ultimately limited by water availability15. This is likely to be even more relevant in Australia, the driest inhabited continent on Earth—indeed, estimated routes of gene flow among Aboriginal Australians suggest that the arid interior acted as a barrier to migration11. We therefore invoked an additional limitation on dispersal by calculating a probability of realizing a long-distance dispersal event (Pl) according to the following equation previously designed to limit modelled species migrations81:

$$P_l = 1 – left( {frac{{D_l}}{{D_{H_2O}}}} right)^{Omega}$$

(5)

where Dl = the realized maximum dispersal distance generated from the algorithm described above, (D_{H_2O}) = the distance to water in units of map cells derived from the Australian Water Observations from Space dataset15, and Ω = the hydrological resistance parameter set arbitrarily to a value of 3 to invoke landscape-scale resistance to movement only in the driest areas of Sahul per generational time step.

Ruggedness

We hypothesized that high landscape ruggedness (elevational gradient) might at least partially impede the progress of human expansion across the landscape42, so we tested this using data available in an ethnographic and environmental dataset compiled by Binford42. Available in the binford library82 in R, the dataset includes >200 variables measuring aspects of hunter–gatherer subsistence, mobility and social organization for 339 ethnographically documented groups. Given the evidence that mobility is a function of productivity36,80, we constructed a simple linear model of annual movement varying with annual rainfall and the difference between maximum and minimum elevation within a 25- (40.2 km) mile radius of the group’s centroid (equivalent to an elevational gradient; i.e., ruggedness). Taking the cube root of annual movement and the difference in maximum and minimum elevation to comply with the assumption of Gaussian error distributions, the expected relationship between movement and rainfall prevailed, and there was a weak effect of elevational difference—a maximum of 1% reduction in annual movement (Supplementary Fig. 8). Expressed on the linear scale and standardizing annual movement and elevational difference to the range of 0–1 (assuming a constant median annual rainfall value), an exponential decay function of the form:

$$M_{{mathrm{red}}} = a + broot {3} of {{G_{{mathrm{rel}}}}}$$

(6)

where Mred = the proportion of expected total annual movement, a = 1.001116, b = −0.0104453 and Grel = the standardized ruggedness from 0 to 1, described the reduction in annual movement rates up to a maximum of 1% (Supplementary Fig. 8). For all instances of emigration, immigration and long-distance dispersal, we assigned this function to the total number of people migrating for each cell based on its standardized ruggedness. We computed the topographic ruggedness index83 as the difference in elevation between a given cell and its eight neighbouring central cells, based on our digital elevation model. For a given cell, we then squared each of the eight elevation difference values (to render them positive), and calculated the square root of the averages of the squares. We updated the spatial resolution of our results to 0.5° × 0.5° to match the other environmental layers.

Catastrophic mortality events

Palaeo-demographic investigations of past human populations suggest that long-term population growth rates were just slightly higher than zero as a result of episodes of catastrophic mortality arising from pandemics, natural disasters and violent conflicts occurring every few generations84. This also agrees well with estimates of the probability of mass mortality events scaling to generation time for vertebrates (Pr(catastrophe) = 0.14 per generation)43. We thus sampled binomially at Pr = 0.14 for whether a catastrophe occurred in each focal cell, and then beta-sampled the severity of the event centred on Mcat = 0.5 (SD = 0.5/10) to emulate a stochastic catastrophe event of 50% mortality, on average, for that cell43.

However, we reasoned that a random allocation of catastrophes among cells across the entirety of Sahul was not realistic, for the reason that mortality events arising from natural disasters, warfare or disease outbreaks would likely be spatially aggregated. We therefore imposed a Thomas cluster process using the rThomas function from the spatstat R library85, setting the intensity of the Poisson process of cluster centres κ to a linear relationship between the number of cells occupied per iteration and a vector ranging from 0.3 to 1.2, the standard deviation of random displacement along each coordinate axis of the grid of a given cell away from the cluster centre σscale = 0.015, and the mean number of cells per cluster μ = 0.6 × the mean dimension of the occupied grid per iteration. This combination of parameters led to a reasonable degree of spatial clustering while maintaining a random spread of cells around a catastrophe focal point, as well as maintaining the overall proportion of cells across the landscape experiencing a catastrophic mortality event ~0.14 per generational iteration.

Scenarios

We ran 120 scenarios (8 entry times, ×5 entry sequences, ×3 relationships between carrying capacity and net primary production) where we modified three main components of the stochastic simulations: (i) the timing of first entry to Australia (from 85 to 50 ka, in 5000-year increments), (ii) the place of entry (northern, southern, simultaneous northern and southern, northern followed by southern 2000 years later, or southern followed by northern 2000 years later) and (iii) the form of the relationship between hindcasted net primary productivity and human carrying capacity (linear, rotated parabolic or reciprocal quadratic yield density). We repeated each scenario 100 times to generate a per-cell confidence interval of time of first arrival. Here, we deemed a cell to have been populated for the first time once it received ≥100 individuals (Nfirst), which is considered the minimum viable effective population size to avoid inbreeding depression70.

Comparison layers

To test the resultant outputs against real archaeological data, we compiled a conservative list of ages older than 30 ka obtained from across Sahul (see ‘Compiling reference archaeological dates’ in the Supplementary Information and Supplementary Data 1). However, the spatial coverage of these ages is highly uneven (Fig. 1a), so we applied a maximum-likelihood method to correct for the Signor–Lipps effect first developed by Solow86 and adapted for spatial inference of both first-arrival and extinction patterns87. While described in more detailed elsewhere87, we briefly summarize the approach here.

To correct for the inherent spatial bias of dates in a landscape, let x1,…xn be the spatial locations of n dated specimens in an area W and a1,…an their respective ages. The estimated average age M(x) of a putative date at a given location x is based on a standard kriging procedure88 derived from the spatial covariance between the age of two dated specimens as a function of their respective pairwise distance, so that:

$$hat Mleft( x right) = mathop {sum}limits_{i le n} {w_ileft( x right)a_i}$$

(7)

where (w_1left( x right), ldots w_nleft( x right)) follows (mathop {sum}nolimits_{i le n} {w_ileft( x right) = 1}) and minimizes

$$mathop {sum}limits_{i le n} {w_ileft( x right)gamma left( {x_i – x_j} right) + mu = gamma (x – x_j)}$$

(8)

for j ≤ n, with μ being a Lagrange multiplier so that (mu = mathop {sum}nolimits_{i le n} {gamma (x_i – x)}) and γ is the variogram:

$$gamma left( u right) = frac{1}{2}Eleft( {aleft( z right) – aleft( {z + u} right)} right)^2 = sigma ^2 – cleft( u right)$$

(9)

where a(z) is the age a of a specimen found at a given location z (with zW), σ2 is the variance of a(z) and c(u) is the covariance between a(z) and a(z + u), with any two locations in W separated by distance u.

We then modified Solow’s method89 to correct for taphonomic bias, which assumes initially that the distribution of ages through time is uniform between a given age A0 when individuals are assumed to be present, and the date of arrival A. For n ages of a given time series at a given location, the estimated terminal age (hat A) is therefore:

$$hat A = A_0 + frac{{n + 1}}{n}{max} _{i}left( {a_i – A_o} right)$$

(10)

To integrate this method into a spatial context, we estimated a preliminary age Ap across space assuming (hat Mleft( x right)) follows a stationary random field:

$$hat A_pleft( x right) = 2hat Mleft( x right) – A_0$$

(11)

But this generates a spatial bias (hat A_pleft( x right) – A(x)), in every (hat A_pleft( x right)), so we applied a simulation-based, spatial-bias-correction procedure90 to estimate the bias generated by Eq. (11) at each x across W. The first step assumes that (hat A_pleft( x right)) is the ‘true’ date of the terminal event in x. Based on these (hat A_pleft( x right)), we generated k age samples (a^{(k)} = (a_1^{left( k right)}, ldots ,a_n^{(k)})) at the same locations x1, … xn following the same spatial pattern and characteristics as the dated record and sampled independently from a uniform distribution on ([A_{0,}hat A_pleft( {x_i} right)]). We then inferred (hat A^{(k)}(x)), the timing of the terminal event for the k new simulated time series and calculated an estimated total bias (hat Bleft( x right)) across all k ages:

$$hat Bleft( x right) = frac{1}{k}mathop {sum}limits_k {hat A^{left( k right)}left( x right) – hat A_pleft( x right)}$$

(12)

The final estimate of the timing of the terminal event of interest (hat Aleft( x right)) is the distribution of the preliminary dates (hat A_pleft( x right)) for every location x corrected by (hat Bleft( x right)), such that:

$$hat Aleft( x right) = hat A_pleft( x right) – hat Bleft( x right)$$

(13)

Because archaeological age estimates ai are always associated with an inherent dating uncertainty σi, we assumed that age uncertainties are Gaussian and independent91 so that the probability density of the estimated age of the terminal event A(x) follows:

$$int_{{it{epsilon }}_1,…,{it{epsilon }}_n} {A_{left{ {a_1 + {it{epsilon }}_1, ldots ;a_n + {it{epsilon }}_n} right}}(x)prod _ig_ileft( {{it{epsilon }}_i} right)d{it{epsilon }}_1 ldots d{it{epsilon }}_n}$$

(14)

where gi = the density of the Gaussian random variable with mean 0 and variance (sigma _i^2), and (A_{left{ {a_1 + {it{epsilon }}_1,, .., a_n + {it{epsilon }}_n} right}}(x)) = the final estimate at a given location x for a time series of age (a_1 + {it{epsilon }}_1,, .., a_n + {it{epsilon }}_n) located at x1,..xn, respectively. We applied the same Cook and Stefanski bias-correction procedure so that the k ages are independently sampled from a uniform distribution on ([A_{0,}hat A_pleft( {x_i} right)]) at the same locations x1, … xn following the same spatial pattern and characteristics as the dated record. This gives (a^{(k)} = (a_1 + {it{epsilon }}_1^{(k)},, .., a_n + {it{epsilon }}_n^{(k)})) with ({it{epsilon }}_i^{(k)}) independently sampled as a function of the probability density described in Eq. (14) to account for the dating uncertainty associated with each age. We then use the terminal ages (hat A^{(k)}(x) = A_{left{ {a_1^{left( k right)},,..,, a_n^{left( k right)}} right}}left( x right)) to estimate the bias in Eq. (12) and apply this to provide a corrected timing of (hat Aleft( x right)) for every x following Eq. (13).

Global sensitivity analysis

We designed a global sensitivity analysis to provide robust sensitivity measures of the probability of the time to saturation of the entire Sahul continent to variation in the underlying parameters of our stochastic model54,92; this analysis does not repeat the scenario-testing parameters (i.e., time of entry, point(s) of entry, KPp relationship). For this global sensitivity analysis, we used the initial scenario parameters of a 50-ka entry at the southern route, the rotated parabolic relationship between K and Pp, and assuming a founding population size stochastically sampled between 1300 and 1500 people for the entry point2.

Here, we ran the cellular-automaton spatial model 1000 times, randomly sampling 12 of its parameters uniformly for each iteration based on a Latin hypercube-sampling protocol54. We set the 12 parameters to be sampled with ±50% variation on the median value used in the model (except for Ncat and max N/K with a maximum upper bound of 0.99, and for maximum Dcell—see below); these 12 parameters were: (i) the maximum generational rate of population increase rm used to parameterize the phenomenological population-dynamics model per cell (range: 0.10–0.31), (ii) the minimum maximal dispersal distances Dm estimated from the allometric prediction (11–34 km), (iii) the cell-based maximum dispersal distance modifier (max Dcell), ranging from 1× to 5× the value set in the original model (10 cells), (iv) the cell-based minimum viable population size NMVP (50–150 individuals) below which we set (v) an additional mortality parameter MMVP (0.1–0.3), (vi) the hydrological resistance parameter Ω invoking landscape-scale resistance to movement only in the driest areas of Sahul per generational time step (1.5–4.5), (vii) the beta-resampled mean mortality of a cell during a catastrophe event Mcat (0.38–0.99), (viii) the beta-resampled proportion of people moving between cells when a migration event occurs Pmig (0.17–0.50), the beta-resampled (ix) minimum and (x) maximum ratios of N/K per cell invoking an emigration event (0.15–0.45 and 0.35–0.99, respectively), (xi) a resistance modifier R that modified the relationship between landscape ruggedness and maximum dispersal probability (0.5–1.5) and (xii) the population threshold Nfirst above which we determined a cell to be occupied for the calculation of the date of first arrival in a cell (50–150 individuals).

We chose to summarise the output of each of these 1000 parameter-sampled runs of the spatial model as the time taken to achieve continental saturation (i.e., the number of years taken from initial entry to occupy every cell in Sahul). In a separate analysis, we then tested the influence of the per-model run parameter values (predictors) on the time to continental saturation (response) using a boosted-regression tree93 emulator with the function gbm.step in the dismo R library94. Here, we set the error distribution family as Gaussian, the bag fraction to 0.75, the learning rate to 0.008, the tolerance to 0.0001, the maximum number of trees to 10,000 and the tree complexity to 2 (first-order interactions only). To assess the relative contribution of each of the 12 randomly sampled parameters to the time to spatial saturation, we calculated the boosted-regression tree metrics of relative influence54.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.


Source: Ecology - nature.com

Q&A: Vivienne Sze on crossing the hardware-software divide for efficient artificial intelligence

China’s transition to electric vehicles