Field-based tree mortality constraint reduces estimates of model-projected forest carbon sinks
Biogeographic pattern of LOSSThe original forest plot data aggregated at 0.25 degree show large spatial variations (Fig. 1a) across the continents, with the greatest LOSS in Asia & Australia (mean ± 1 SE; 6.5 ± 0.5 Mg ha−1 y−1) > South America (4.9 ± 0.2 Mg ha−1 y−1) and Africa (4.6 ± 0.2 Mg ha−1 y−1) > North America (2.3 ± 0.1 Mg ha−1 y−1 in boreal and 2 ± 0.1 Mg ha−1 y−1 in temperate)36 (Fig. 1b; Supplementary Fig. 5a). This pattern was robust to bootstrapping (1000 iterations) to randomly select 90% of plots for estimating the probability distribution of the mean continental values (Supplementary Fig. 5b). The upscaled gridded LOSS maps generated by our random forest algorithm (see Methods) over the spatial domain of our full datasets shows hotspots of high LOSS in Southern Asia & Australia ( > 6 Mg ha−1 y−1), Northwestern South America (Amazon) ( > 5 Mg ha−1 y−1), and the western coast of North America ( >3 Mg ha−1 y−1)10,36,37,38 (Supplementary Fig. 6a). These patterns were robust to two bootstrapping approaches – based on the sampled biomes of each point feature and also randomly sampling 90% data with replacement (see Methods) (Fig. 2a; Supplementary Fig. 6b). The uncertainty (coefficient of variance – CV; %mean) was generally low ( 10%), despite the larger sample size (n > 500 at 0.25 degree) (Fig. 2b; Supplementary Fig. 6c), likely because of potential effects of forest recovery or regrowth following past disturbance16 as well as the small plot size (i.e., 0.067 ha in each plot)39.Fig. 1: Map of sample locations and biomass loss to mortality (LOSS) data.a Sampling sites. A total of 2676 samples were collected and aggregated into 814 grids at 0.25 degree that were used for geospatial modeling. b The median and interquartile range of LOSS across continents—North America, South America, Africa, and Asia & Australia.Full size imageFig. 2: Map of biomass loss to mortality (LOSS) and its uncertainty across continents.a, b Ensemble mean of LOSS a and its uncertainty (coefficient of variation, b across continents at 0.25 degree derived from forest plot data using the bootstrapped (10 iterations) approach by randomly sampling 90% plots with replacement. c, d Ensemble mean of LOSS c and its uncertainty (coefficient of variation, d across continents at 0.5 degree derived from six dynamic vegetation models (DGVMs, ORCHIDEE, CABLE-POP, JULES, LPJ-GUESS, LPJmL, and SEIB-DGVM). Coefficient of variation was quantified as the standard deviation divided by the mean predicted value as a measure of prediction accuracy. e The difference of LOSS between ensemble mean of DGVMs and ensemble mean of LOSS derived from forest plots data across continents at 0.5 degree, quantified as difference between c and a, whereby LOSS in Fig. 2a is resampled at 0.5 degree.Full size imageDrivers of LOSSMean annual temperature (MAT), aridity index (the ratio of precipitation to potential evapotranspiration), and precipitation seasonality were identified as the dominant predictors of LOSS across continents (Supplementary Fig. 7a), with positive relationships with LOSS (Fig. 3a)10,36. In contrast to local-scale studies40,41, wood density, forest stand density, and soil conditions were poor predictors of LOSS when all data were used. These relationships were largely driven by the spatial pattern of LOSS and climate gradients, whereby LOSS and MAT, aridity index, and precipitation seasonality were high in tropical forests (Supplementary Fig. 8). This motivated us to examine the drivers of LOSS in tropical vs non-tropical biomes (Supplementary Fig. 7b, c; Fig. 3b–d). With a smaller gradient in climate within wet tropical forests, soil properties such as nutrient content and cation exchange capacity (CEC) were significant predictors of LOSS (Supplementary Fig. 7b; Fig. 3b)42. In wet tropical forests, the relationships between soil nutrient content and CEC and LOSS were positive (Fig. 3b) and thus appeared to support the pattern of higher mortality in more productive tropical forests growing over nutrient rich soils42,43. In non-tropical regions, basal area or a competition index based on the degree of crowding within stocked areas44 (see Methods) were the dominant predictors of LOSS, especially in extra-tropical North America (Supplementary Fig. 7c; Fig. 3c, d). This result highlights the role of stand competition in driving the spatial patterns of LOSS44,45. This pattern also supports the existence of a spatial tradeoff between faster growth and higher mortality because of resource limitations or younger death, whereby competition plays the fundamental role13,45. In contrast to other studies15,46, forest age (available in boreal and temperate forests in North America) was not a good predictor of LOSS (Supplementary Fig. 9), likely because of our focus on mature and old-growth forests (i.e., age > 80 and 100 years in boreal and temperate forests, respectively).Fig. 3: Standardized response coefficients (mean ± 95% CIs) between LOSS and dominant environmental drivers.The scales analyzed were at continents a, tropical regions b vs non-tropical regions c, d. The response coefficients were quantified by linear mixed model which account for each plot as a random effect. Panels c and d used basal area and stand density index (SDI) as competition index, respectively. SDI was defined as the degree of crowding within stocked areas and quantified as a function of tree density and the quadratic mean diameter in centimeters. Basal area is strongly correlated with total biomass and higher LOSS in higher basal area may be merely because of its correlations. Thus, we used another competition metrics – SDI to further confirm the role of competition in LOSS. The error bars denote the 95% confidence interval. *P 130%) in western boreal forests in North America (Fig. 2d).Conventional emergent constraintWe first used the conventional emergent constraint approach27 to constrain the projected (2015–2099) NPP and HR across continents. This approach was conducted by building the statistic (linear) relationship between the historical LOSS averaged at forest-plot scale (derived from original plot data of LOSS) or continental scale (derived from the map of LOSS) and projected NPP and HR summed across continents (see Methods and Supplementary Fig. 4 for details). We found that the emergent constraint approach worked well in North America, where the relationship between historical LOSS and projected NPP and HR was significant (the scenario of using original plot data of LOSS: R2 = 0.68 and P = 0.04 for grid-level NPP; R2 = 0.97 and P = 0.0001 for grid-level HR; the scenario of using map of LOSS at continent scale: R2 = 0.7 and P = 0.04 for grid-level NPP; R2 = 0.95 and P = 0.0008 for grid-level HR) (Supplementary Fig. 11a; Supplementary Fig. 12a). This emergent constraint approach was less effective, however, for other continents, where tropical forests are predominant (all P > 0.05; Supplementary Fig. 11b, c, d; Supplementary Fig. 12b, c, d). These results suggest a weak linear relationships when observations are lumped or averaged at broad continental scales for tropical continents, thus highlighting the importance of spatial scale and non-linear relationships in emergent constraint25. We interpret the result that this LOSS emergent constraint works better in North America than in the tropical forests, by a better representation of forest plot distribution and couplings of LOSS and NPP and HR across space in North America.Machine learning constraintTo overcome this limitation, we trained a machine learning algorithm34 to reproduce the emerging relationship between historical LOSS and projected NPP and HR at grid level in each DGVM by incorporating all grid values without or with climate predictors, expressed as NPPpro or HRpro = f(LOSShis) or f(LOSShis, MATpro, MAPpro), respectively, where pro refers to projected variables, his refers to historical variables, and MAT and MAP is mean annual temperature precipitation, respectively (see Methods). The results show consistently positive non-linear relationships between LOSShis and NPPpro or HRpro across DGVMs (Supplementary Fig. 3). Our machine learning algorithms can surrogate well the results of process-based models between the historical LOSS and the projected NPP and HR (R > 0.65 and R > 0.9 in both scenarios without climate effects and with climate effects, respectively; see Methods) (Supplementary Fig. 13). After including the observed LOSShis (derived from LOSS) in the machine learning algorithm, we were able to generate spatially explicit constrained estimates34 of projected NPP and HR, and then compare them with the scenario without the constraint (Supplementary Fig. 14; Supplementary Fig. 15). These patterns essentially show a lower NPPpro or HRpro in locations of overestimated LOSShis in DGVMs, consistent with the positive relationship between LOSShis and NPPpro or HRpro (Supplementary Fig. 3).Our results show that most DGVMs overestimate tree mortality, particularly in tropical regions (Fig. 2c, e). Thus, if modeled mortality is over-estimated, we expect that NPP is over-estimated as well. Ultimately, we used a bootstrap approach to generate 100 maps of mean value of LOSS with its distribution following the values of the average and 2 times of standard deviation of LOSS maps as a conservative constraint (see Methods). Then the 100 maps of mean value of LOSS were used to constrain the projected NPP or HR as ensemble means in our ML constraint and the uncertainty of the constraint was assessed. Our bootstrapping constraint approach by LOSS reduces this common bias of models and decreases projected NPP down to 7.9, 2.3, 2 Pg C y−1 in South America, Africa and Asia & Australia, compared to original NPP values of 9, 2.4, 2.3 Pg C y−1 (Fig. 4a). The reason for this is that NPP or growth is strongly positively correlated with LOSS across space in both inventory data and DGVMs (Supplementary Figs. 2 and 3; Supplementary Fig. 16). The constant mortality parameter used in most models may be too large if modelers have tuned this parameter to obtain reasonable biomass stocks, thus compensating for an overestimate of NPP in absence of modeled competition between individuals and nutrients (e.g. phosphorus) limitations in tropical forests13. Likewise, HRpro showed similar patterns with NPPpro because of coupling of HR and NPP and LOSS at broad spatial and long term scales20,21, despite the likely decoupling of the instantaneous rate of HR and NPP and LOSS at local and short-term scales22,23. Thus, we also constrained a decrease in projected grid-level HR with values of 6.5, 1.9, 1.7 Pg C y−1 in South America, Africa and Asia & Australia compared to 7, 1.9, 1.8 Pg C y−1 in the original model ensemble (Fig. 4b). Taken together, our results constrain a weaker future tropical forest carbon sink from observation-based LOSS estimates down to 1.4, 0.4, 0.3 Pg C y−1 in South America, Africa and Asia & Australia as compared to 2, 0.5, 0.5 Pg C y−1 in the original models. The projected sink is reduced in particular over the Amazon basin, while North America showed an enhanced future carbon sink (1.1 and 0.8 Pg C y−1 after and before constraint, respectively). The constraint by the machine learning approach significantly reduced the model spread in grid-level NPPpro and HRpro generally in tropical regions and particularly in South America (Fig. 4; Table 1). This was in contrast to the case of constraint at the whole North America scale (Fig. 4; Table 1), presumably because of spatial trade-off or compensation from regions of mortality overestimation (i.e., eastern North America—temperate zones) vs underestimation (i.e., boreal zones). To this end, we further divided the whole North America into temperate and boreal forests and found the significant effects of the ML constraint (Supplementary Fig. 17). These results highlight the importance of spatial scale in the ML constraint approach. We thus recommend accounting for the role of spatial trade-off in our ML constraint approach or using our ML constraint approach at broad spatial scales whereby the effect of spatial trade-off is minimal. We also caution that the bootstrapping (100 times) approach used in our ML constraint increases the sample size and could have increased the significant difference with and without LOSS constraint. Overall, the uncertainty of the ML constraint was low in the bootstrapping approach (Supplementary Fig. 18).Fig. 4: Projected grid-level NPP and grid heterotrophic respiration (HR) across continents.a, b Projected (2015–2099) grid-level NPP a and grid-level HR b across continents quantified by six dynamic vegetation models—DGVMs (ORCHIDEE, CABLE-POP, JULES, LPJ-GUESS, LPJmL, and SEIB-DGVM). The y axes are the minimum, mean, and maximum values in six DGVMs. ‘DGVMs’ refers to the scenario before constraint and ‘DGVMs + Observation’ refers to the scenario after constraint without climate predictors. The constraint was achieved by using the observational maps (n = 100; through a bootstrapping approach; see Methods for details) of LOSS derived from forest plots data to feed into the trained ML (random forest) model. Reported are ensemble means of constraint. The constraint effect was significant when North America were divided into temperate and boreal forests (see results of Supplementary Fig. 17). *P More