Disturbance detection
We used a well-established spectral-temporal segmentation method, Landsat-based Detection of Trends in Disturbance and Recovery (LandTrendr), to detect disturbances within the Google Earth Engine (GEE) cloud-computing platform57,58. The core of the LandTrendr is to extract a set of disturbance-related metrics by breaking pixel-level annual time-series spectral trajectories into linear features using Landsat observations. The LandTrendr has been widely used for change detection in various forest settings, and details about the algorithms can be found in previous publications57. Here we briefly described the key steps in generating the year and type of disturbances in China’s forests using the LandTrendr within the GEE platform. The overall analytic flow can be found in Supplementary Fig. 10.
First, we generated annual spectrally consistent time-series data by using all available, good quality (cloud cover ≤ 20) Tier 1 Landsat 5 (Thematic Mapper), Landsat 7 (Enhanced Thematic Mapper Plus), and Landsat 8 (Operational Land Imager) images acquired during the peak growing seasons (June 1—September 30) from 1986 to 2020. The peak growing seasons were selected to exclude compounding influences from ice, snow, and soil, and to maximize the spectral changes after forest disturbances. To tackle the spectral inconsistency among Landsat sensors, we harmonized spectral values via linear transformations according to band-respective coefficients presented in59. Clouds, cloud shadows, snow, and water were masked out using the Fmask algorithm60. The annual band composites at 30-meter spatial resolution during 1986–2020 were computed using the Medoid method61.
Secondly, we ran the LandTrendr using five spectral indices, including two spectral bands (shortwave infrared I and II that were B5 and B7), tasseled cap wetness (TCW), normalized burn ratio (NBR), and normalized difference vegetation index. These five indices were effective indictors to represent vegetation greenness and structures, and were commonly used for detecting changes in forest disturbance and recovery62. For each spectral index, the LandTrendr produced a set of parameters to describe a possible disturbance event at the pixel level, including spectral values at pre-disturbance level (preval), magnitude of change (mag), duration (dur) and rate of change (rate), and the signal-to-noise ratio (dsnr) (n = 5). Using these five spectral indices, we generated a stack of disturbance-related parameter layers (n = 25, 5 spectral indices × 5 parameters), which were later used to detect and classify disturbances using machine learning models derived from reference data (described below).
Disturbance classification
Reference data
High-quality consistent reference data is key to train and classify disturbance types. To do so, we generated a total of 31225 reference points using a hierarchical approach. We first generated a large number of potential disturbance points using forest loss data from 2001 to 20203. Then we separated fire disturbances from non-fire disturbances by overlaying MODIS burned area (BA) with potential disturbance points following the procedure used by63. Specifically, fire disturbances were determined if the MODIS BA data coincided with the Landsat-derived forest loss for the fire year and 2 years postfire (i.e., t + 0, t + 1, t + 2) to account for delayed post-fire tree mortality. Following this step, we derived points as potential disturbances that consisted of fires and non-fire disturbances (including forest conversion to other land use types and silvicultural practices at various intensities). We also generated roughly the same number of points that experienced no disturbances (e.g., persistent forests), which were determined by selecting pixels with very few changes in spectral indices. These reference points, including fire, non-fire disturbances, and persistent forests, were then used to sample the time-series spectral data from 1986 to 2020. Finally, time-series spectral data from each reference point were visually checked to make sure they accurately represented disturbance events. This process resulted into a total of 31225 reference data points, including 2356 fire disturbance points, 13,242 non-fire disturbance points, and 15,627 no disturbance points (persistent forests) (Supplementary Fig. 2).
Random forest classification
We used machine learning modeling to classify each pixel into fire disturbance, non-fire disturbance, or no disturbance. The reference data points were used to sample the LandTrendr-derived disturbance-related parameter layers described above, which resulted into a dataset consisting of disturbance types. We divided the dataset into 70% of training data, and 30% as validation data. Using the training data, a Random Forest (RF) model was trained to classify each reference point into fire, non-fire disturbance, or no disturbance. Our RF approach showed that short-wave infrared (SWIR)-based moisture indices (e.g., B7, TCW) were strong predictors for detecting forest disturbances (Supplementary Fig. 11) likely because of their sensitivity to vegetation water content and canopy structure64. Finally, we applied the trained RF model to the full classification stack to consistently map the disturbance types from 1986 to 2020 across China’s forests, assuming that the spectral trajectories derived from reference data period 2001–2020 can be extrapolated to the whole mapping period 1986–2020. However, note that our approach was meant to detect relatively acuate and discrete disturbances that caused canopy opening, rather than subtle changes of forest structure or composition resulted from low intensive silvicultural practices and chronic disturbances.
Year of disturbance
We used the LandTrendr to determine the year of disturbance as the onset of magnitude of spectral change. Since we ran LandTrendr on five spectral indices, there were five possible years of disturbance for each pixel. Thus, we determined the year of disturbance using the median value from at least three different indices (i.e., NDVI, NBR, TCW, B5, B7). In this way, we only kept pixels that were detected as disturbances using at least three indices, thus reducing commission errors. The year with the greatest spectral changes generated by the LandTrendr often had an accuracy within 3 years11. A confidence level was also assigned to each disturbed pixel based on numbers of indices which showed possible disturbance events. Specially, low, medium, and high confidence were assigned if the disturbance was detected by three, four, or five spectral indices, respectively.
Validations
We validated the disturbance map at the pixel and national levels. At the pixel level, we validated the final map using the validation sub-sample described in the previous section. We derived a confusion matrix to report user’s and producer’s accuracy (Supplementary Table 1) as the main accuracy assessment metrics. At the national level, we compared forest disturbance detected in this study to available existing dataset. Specifically, we compared the area of forest fire disturbance between our study and the national fire records during 2003–2009 (Supplementary Fig. 5). We compared the disturbance rates between our study and Landsat-derived global forest cover changes from 2001 to 20193 (Supplementary Fig. 4).
Post-processing
We applied a series of spatial filters to minimize the unrealistic outliers from two potential sources of uncertainty, including speckle in time-series spectral trajectories or misregistration among images. This may lead to individual pixel or small patches including only a few pixels, which were (a) detected as disturbances, thus increasing the commission errors, or (b) not detected as disturbances, while their surrounding pixels were mostly disturbed, thereby increasing the omission errors. To address the issue (a), we removed all single-pixel disturbance patches through setting the minimum mapping unit as two 30 × 30 m2 pixels (0.18 ha). To address the issue (b), we applied a 3 by 3 moving window to fill holes through assigning the year of disturbance based on the years in the surrounding pixels. Finally, we smoothed the year of disturbance by assigning the center pixel using majority rules from surrounding pixels within the 3 by 3 windows, thus accounting for artefacts associated with uncertainties in the correct identification of the disturbance year.
Characterizing disturbance regimes and their trends
We characterized the disturbance regime using five indicators within each 0.5° grid cell (n = 1946) across China’s forests based on annual forest disturbance maps generated from the previous step. Within each grid cell, we calculated (1) total annually disturbed forest area (km2 yr−1), (2) percentage of forest disturbed annually (% yr−1), as annual disturbed forest area divided by the total forested area, (3) disturbance size (ha), as the number of disturbed pixels for each individual patch using an eight-neighbor rule, (4) disturbance frequency (# of patches per 1000 km2 forested area each year), as the number of disturbance patches per year divided by the total forested area, (5) disturbance severity (ΔNDVI = NDVIt−1 − NDVIt+1), as magnitude of NDVI change 1 year before and 1 year after disturbance, obtained from the LandTrendr analysis. We used (1) and (2) to characterize the disturbance rate, and (3)–(5) to describe the patch characteristics. The (2) and (4) were normalized by forest area within each grid cell, thus making them comparable among grid cells. For (3)–(5), we only calculated the patch size >0.45 ha (five 30 × 30-m2 pixels), because patches <0.45 ha only contributed <10% of disturbance area, while greatly increasing the computation demand. Larger patches were also likely to be detected more accurately, having greater effects on forest dynamics and functions. In our analysis, the area was calculated by numbers of pixels multiplied by pixel size (0.09 ha).
We reported the five disturbance regime indicators at each grid cell, and aggregated patch-based metrics (disturbance size, frequency, and severity) within each grid cell. We visualized the spatial patterns of forest disturbance regime using arithmetic mean and reported the statistics using both arithmetic mean and median. Trends in five disturbance regime indicators at each grid cell were quantified using the Mann Kendall trend test, which is a non-parametric measure of monotonic trends in time series insensitive to outliers.
We also assessed how variations in satellite image availability affected the trend of disturbance. There is an increasing availability of Landsat images per year as Landsat mission evolves (Supplementary Fig. 1). Although it provides us more opportunities to detect changes on the ground, this could potentially cause a false positive trend13,14. The number of available Landsat images was much lower from 1986 to 1999 but increased with time. As a result, there has been great interannual variability in disturbance area, with a non-significant decreasing trend. However, the number of available Landsat images has doubled since 2000, except that the number of available Landsat images was especially low in 2000 and 2012 due to the decommission of TM and ETM + sensors, respectively (Supplementary Fig. 1). This resulted in a relatively small interannual variation in disturbance area and a decreasing trend. Although more detailed analysis (see below) indicated positive trends of Landsat observations contributed to the increasing disturbance rate during 1986–2020, our results showed that its influence was relatively small compare the other factors (Fig. 4). Furthermore, the declining rate of disturbance would be higher if the density of Landsat observation was consistent during 1986–1999, because the denser Landsat observations in the later observation period could create false positive trends.
Underlying drivers of changing patterns of forest disturbance
To understand the underlying drivers of changing patterns of forest disturbance, we used the generalized linear model (GLM) to explore the effects of climatic factors, biophysical factors (topography, vegetation, soil), social-economic factors (e.g., management, pullulation density, road density, and distance to the nearest road), and variability in Landsat observations on annually disturbed forest area at each 0.5-degree grid cell. Climatic factors were used to investigate if background climates had an influence on disturbance patterns. We summarized annual mean air temperature (MAT), mean annual precipitation (MAP), and vapor pressure deficit (VPD) from 1985 to 2020 using 4-kilometer, monthly gridded TerraClimate data65. VPD was then only included in the analysis due to its collinearity with MAT and MAT. Forest management variables included percent of forest management area and the status of ecological projects. The percent of forest management area was calculated as percent of pixels with forest management activities divided by all forest pixels at 0.5° grid cell, using a newly derived forest management dataset for 201566. The boundary of ecological project was used to determine whether disturbance patches were located in the ecological project regions (i.e., 1-within the ecological project region, 0-outside ecological project region, detailed descriptions of ecological projects were provided below). Population density (persons/km2), road density (km/km2), and distance to the nearest road (km2) were summarized at 0.5° grid cell. Trends for Landsat observations were calculated using annual availability of all Landsat observations (Supplementary Fig. 1). Biophysical factors included topography (e.g., elevation, slope), vegetation (e.g., NDVI as a proxy for rate of growth), and soil (e.g., soil organic content, soil bulk density).
The GLM was fitted using annually disturbed forest area as dependent variables, and climatic factors, biophysical factors, management, and availability of Landsat observations as independent variables. The GLM was fitted using the glm function of the stats package in R. A parsimonious model was selected using the stepwise algorithm based on Akaike Information Criterion (AIC). The standardized regression coefficient from the GLM can be interpreted as the relative influences of independent variables. Partial plots were used to indicate direction of independent variables (Fig. 4).
Forest restoration policy and its effects on disturbances
Forest management policy has undergone substantial changes since the establishment of P.R. China in 1949. Before 1970s, excessive timber harvest decreased forest covers, degraded forest conditions, and caused serious ecological and environmental consequence, such as soil erosion, sandstorm, and flood. To restore the degraded forest ecosystems, Chinese government has implemented several national key ecological restoration projects since 1987, including Three-North Shelter Forest Program (TNSFP, also known as Green Great Wall), Yangtze River and Zhujiang River Shelter Forest Projects (RSFP), Natural Forest Protection Project (NFPP), The Grain for Green Program (GGP, also known as China’s Sloping Lands Conversion Project). The TNSFP, initiated in the late 1970s, is an afforestation program, and spans half of northern China. It aims to improve environmental conditions, reduce the soil erosion, and promote the forest/grass-related products in the Three-North regions. The RSFP, initiated in 1989, is a forest restoration program, and covers the Yangtze River and Zhujiang River watersheds. It aims to reduce the soil erosion, rocky desertification, and occurrence of flood in south China. The NFPP, initiated in 1999, is a forest protection/conservation program to conserve China’s natural forests, with the objectives of conserving biodiversity, protecting the water quality, preventing soil erosion and desertification, and reducing the likelihood of floods and other natural disasters associated with deforestation. Within the NFPP regions, the major restoration strategies include decreasing and adjusting the logging intensity, and accelerating afforestation in ecologically sensitive regions, especially along the Yangtze River and the Yellow River basin. The GCP, initiated in 2000, is a nation-wide ecological compensation program, which encourages farmers to convert croplands in hilly areas (>20 degree) to forests. Other than the forest-related programs mentioned above, there are also a few other ecological projects, including the Beijing–Tianjin Sand Source Control Project and Returning Grazing Land to Grassland Project. However, these two projects focus on dryland ecosystems and thus were not included in our analysis24,46.
To further assess the effects of forest management policy on disturbance regimes, we compared the fraction of positive and negative trends of annually disturbed forest area at 0.05 significance level (i.e., p < 0.05) within and outside the ecological restoration projects boundary (Supplementary Fig. 8). A higher proportion of negative trends within the ecological restoration projects boundary indicated a decreasing trend of disturbance rate caused by the effective implementation of the project. The boundaries of these projects were digitized at the province level from maps obtained from the administrative agencies.
Mapping forest expansion
Besides the mapping of disturbances within forests, we also mapped the pattern and rate of forest expansion via reforestation/afforestation outside of forests to fully capture disturbance regimes that regulated forest changes across China. Since the LT-GEE cannot map forest extent dynamics, we derived the forest extent (defined as canopy cover >20%) by merging the 30 m annual land cover map from 1990–201930 and an average tree canopy cover map from AVHRR (1985–2016)32, MODIS (2000–2020)67 and Landsat (2000–2020)3. The annual forest expansion was defined as newly gained forests on previously non-forested land use types and remained forested for the rest of study period. The rate of forest expansion (km2 yr−1) was mapped with each 0.5° grid cell (n = 1946) across China’s forests (Supplementary Fig. 7).
Forest mask
Forest mask in 1986 was derived using the following procedures. First, a composite tree cover map in 1986 (TCcompostite1986) was obtained by merging the tree cover map in 1986 (TC1986) using observations by Advanced Very High-Resolution Radiometer (AVHRR)32 and the tree cover map in 2000 (TC2000) using observations by Landsat sensor3. The percent of tree cover in TCcompostite1986 was determined as the maximum tree cover in TC1986 or TC2000 (that was TCcompostite1986 = max(TC1986, TC2000)). In this way, we could capture the possible forest disturbances that occurred before 2000 (e.g., TC1986 > TC2000), and the expansion of forested area from 1986 to 2000 (e.g., TC1986 < TC2000). Then the forest mask in 1986 was defined as TCcompostite1986 > 20% following Liu et al., (2019). We should note that our study area did not include the newly afforested area after 2000. All analyses were performed within the forest mask, thus excluding the potential confounding factors from other land cover types. The description of TC1986 and TC2000 can be found in3,32.
Source: Ecology - nature.com