in

On track to achieve no net loss of forest at Madagascar’s biggest mine

Study site and context

Ambatovy is a very large nickel, cobalt and ammonium sulphate mine in central-eastern Madagascar owned by a consortium of international mining companies50. It represents the largest ever foreign investment in the country24 (US$8 billion by 201650) and a substantial source of fiscal income49. In 2018, the company contributed ~US$50 million in taxes, tariffs, royalties and other payments49 and employed >9,000 people (93% of whom were Malagasy)51. Commercial production began in January 201424 (Supplementary Fig. 1). As key components in batteries, supplies of nickel and cobalt are critical to the green energy transition and demand for these metals is predicted to increase notably in future52.

The mining concession covers an area of 7,700 ha located in the eastern rainforests of Madagascar (Fig. 1) which have very high levels of biodiversity and endemism53,54. After avoidance and minimization measures were applied (Supplementary Methods) the mine was predicted to clear or substantially degrade 2,064 ha of high-quality natural forest at the mine footprint and upper pipeline24. Any impacts on plantations or secondary habitat are not included in this estimate. Losses at the impact site were not discounted in relation to a background rate of decline, meaning that the company took responsibility for the full area of forest lost25. Independent verification by our team (by measuring the size of the mine footprint on Google Earth) confirms the extent of forest loss at the mine footprint (Supplementary Fig. 2). Clearance of the footprint accounts for most of the forest loss associated with the mine as losses associated with the pipeline are small54.

Ambatovy aims to generate biodiversity gains to offset the mine-induced losses by slowing deforestation driven by shifting agriculture elsewhere26. To this end the company designated four sites, totalling 28,740 ha, to be protected as biodiversity offsets; Ankerana, Corridor Forestier Analamay-Mantadia (CFAM), the Conservation Zone and Torotorofotsy54 (Fig. 1). The offsets are considered like-for-like30 and were selected on the basis of similarity to the impact site in terms of forest structure and type, geology, climate and altitude24. The large combined area of the offsets relative to the impacted area was designed to allow flexibility, account for uncertainty and incorporate as many of the affected biodiversity components as possible24. Ankerana is the flagship offset, selected on the basis of its size, connectivity to the Corridor Ankeniheny-Zahamena (CAZ) forest corridor and the presence of ultramafic outcrops thought to support the same rare type of azonal forest lost at the mine site54. Extensive surveys conducted within Ankerana to establish biological similarity concluded the offset to be of higher conservation significance than the forests of the mine site due to the presence of rare lowland tropical forest24.

The Conservation Zone is directly managed by the company, given its location within the concession area, while the other offsets are managed in partnership with local and international NGOs24,25. Ambatovy funds the management of Ankerana by Conservation International and local NGO partners (although before 2015 Ankerana was directly managed by Ambatovy via a Memorandum of Understanding with Conservation International24), supports BirdLife partner Asity with the management of Torotorofotsy and a number of local NGOs including Voary Voakajy25 are involved in CFAM26. The company is also working to secure formal, legal protection for CFAM26 as part of a proposed Torotorofotsy–CFAM complex new protected area (although progress on this has stalled).

Overview of methods

To estimate the impact of the offsets on deforestation and determine whether this has prevented enough deforestation to offset forest loss at the mine site, we combined several complementary methods for robust impact evaluation. First, we used statistical matching to match a sample of pixels from each biodiversity offset to pixels from the wider forested landscape with similar exposure to drivers of deforestation. Then we used a site-based difference-in-differences regression for each matched offset–control sample and a fixed-effects panel regression on the pooled data, to estimate the effect of protection. We systematically explored how arbitrary modelling choices (including the statistical distance measure used in matching, caliper size, ratio of control to treated units, matching with or without replacement and which, if any, additional covariates were included) affected our inference, exploring the robustness of our results to 116 alternative model specifications.

Matching

The former province of Toamasina was selected as the geographic area from which control pixels were sampled as it encompasses forests of the same type as the concession area with varying degrees of intactness and accessibility. The four biodiversity offsets are located within this province (Fig. 1).

The unit of analysis is a 30 × 30 m2 pixel that was forested in the baseline year 200045,55. It is important that the scale of analysis aligns with the scale at which the drivers of deforestation (in this case, small-scale shifting agriculture) operate56. The median agricultural plot size (from 564 measured plots) in the study region is ~36 × 36 m2 (ref. 57). We took a subsample of pixels to reduce computational effort while maintaining the capacity for robust statistical inference58,59. We used a grid-based sampling strategy ensuring a minimum distance between sample units to reduce spatial autocorrelation60 and equal coverage of the study area58. A 150 × 150 m2 resolution grid, aligned to the other 30-m resolution data layers (Fig. 1c), was overlaid on the province and the 30 × 30 m2 pixel at the centre of each grid square was extracted to produce a subsample of pixels that are 120 m away from their nearest neighbour. The 120 m is larger than the minimum distance between units used in another matching study in Madagascar (68 m; ref. 59) but smaller than that used in other studies (200 m; ref. 61) and so strikes an appropriate balance between the avoidance of spatial autocorrelation and maximizing the possible sample cells.

Protected areas in the study area managed by Madagascar National Parks were excluded from our control sample as they are actively managed and therefore do not represent counterfactual outcomes for the biodiversity offsets in the absence of protection (Fig. 1). However, control pixels were sampled from within the CAZ new protected area as legal protection was only granted in 2015 and resources for management are limited and thinly spread62. Additionally, Ankerana and parts of CFAM overlap with the CAZ and would have experienced the same management, and likely trajectory, as the rest of the CAZ, had they not been designated biodiversity offsets. Areas within 10 km of an offset boundary were excluded from the control sample to reduce the chance of leakage (where pressures are displaced rather than avoided) biasing results17,29. The 10 km was selected as it is a commonly used buffer zone within the literature17,58.

To test for leakage effects, we used Veronoi polygons to partition the buffer area for CFAM, the Conservation Zone and Torotorofotsy (which overlap) into three individual buffer areas according to the nearest offset centroid and took a subsample of pixels from each (Fig. 1). Areas that overlapped with the established protected areas of Mantadia National Park and Analamazotra Special Reserve were excluded from the buffer zones.

The outcome variable is the annual deforestation rate sourced from the Global Forest Change (GFC) dataset34. Following Vieilledent et al.45 these data were restricted to only include pixels classed as forest in a forest cover map of Madagascar for the year 200045,55, reducing the probability of false positives (whereby tree loss is identified in pixels that were not forested). The resulting tree loss raster was snapped to the forest cover 2000 layer to align cells, resulting in a maximum spatial error of 15 m. The GFC product34 has been shown to perform reasonably well at detecting deforestation in humid tropical forests63. In the north-eastern rainforests of Madagascar, Burivalova et al.39 found that GFC data performed comparably to a local classification of very high resolution satellite imagery at detecting forest clearance for shifting agriculture (although it was not effective at detecting forest degradation from selective logging). As clearance for shifting agriculture is considered the principal agent of deforestation in the study area22 and the forests of the study area are tropical humid (>75% canopy cover), the GFC data are an appropriate tool for quantifying forest loss. Although recent evidence suggests that GFC data may have temporal biases64, this phenomenon affects our control and treated samples equally and so is unlikely to impact our results.

The choice of covariates is extremely important in matching analyses. They must include, or proxy, all important factors influencing selection to treatment and the outcome of interest so that the matched control sample is sufficiently similar to the treated sample in these characteristics to constitute a plausible counterfactual, otherwise the resulting estimates may not be valid33. On the basis of the literature and a local theory of change we selected five covariates that we believe capture or proxy for the aspects of accessibility, demand and agricultural suitability that drive deforestation in the study area22,59,65,66. These are slope, elevation, distance to main road, distance to forest edge and distance to deforestation (Supplementary Methods). These five essential covariates comprise the main matching specification and form the core set used in all alternative specifications that we tested in the robustness checks. We also defined five additional variables (annual precipitation, distance to river, distance to cart track, distance to settlement and population density) and tested the effect of including these in the robustness checks. The additional covariates were so defined because they were of poorer data quality (population density and distance to settlement), correlated with an essential variable (annual precipitation and population density) or simply considered less influential (distance to river and distance to cart track; Supplementary Methods).

Statistical matching was conducted in R statistics using the MatchIt package v.4.1 (ref. 67). To improve efficiency and produce closer matches we cleaned the data before matching to remove control units with values outside the calipers of the treated sample in any of the essential covariates (see Supplementary Methods for caliper definition). Following the recommendations of Schleicher et al.68 we tested several matching specifications and selected the one that maximized the trade-off between the number of treated units matched and the closeness of matches as the main specification (Supplementary Table 7). This was 1:1 nearest-neighbour matching without replacement, using Mahalanobis distance and a caliper of 1 s.d. This specification produced acceptable matches (within 1 s.d. of the Mahalanobis distance) for all treated units within all offsets. The maximum postmatching standardized difference in mean covariate values between treated and control samples was 0.05, well below the threshold of 0.25 considered to constitute an acceptable match69. This indicates that, on average, treated and control units were very well matched across all covariates.

Matching was run separately for each offset. The resulting matched datasets were aggregated by treated status (offset or control) and year to produce a matrix of the count of pixels that were deforested each year (2001–2019) in the offset and the matched control sample. Converting the outcome variable to a continuous measure of deforestation avoids the problem of attrition associated with binary measures of deforestation and is better suited to the framework of the subsequent regressions70.

Robustness checks

Statistical matching requires various choices to be made68, many of which are essentially arbitrary. There therefore exists a range of possible alternative specifications that are all a priori valid (although some may be better suited to the data and study objectives69) but which could influence the results20,28. We tested the robustness of our results to 116 different matching model specifications (Fig. 4). First, we tested the robustness of the estimates to the use of three alternative matching distance measures (Mahalanobis, standard propensity score matching using generalized linear model regressions with a logit distribution and propensity score matching using RandomForest), three different calipers (0.25, 0.5 and 1 s.d.), different ratios of control to treated units (one, five and ten nearest neighbours) and matching with/without replacement. Holding the choice of covariates constant (using only the essential covariates), the combination of these led to the estimation of 54 different models. Second, we tested the robustness of results to the inclusion of the five additional covariates. Holding the choice of distance measure and model parameters constant, we constructed 31 models comprising all possible combinations of additional covariates with the core set of essential covariates. Finally, we explore the robustness of results for 31 randomly selected combinations of distance measure, model parameters and additional covariates. All 116 specifications are a priori valid, assuming that the covariates capture or proxy for all important factors influencing outcomes, but may fail to satisfy the parallel trends condition or produce matches for insufficient numbers of treated observations (<10%), rendering them a posteriori invalid. It remains important to test the assumptions of the alternative models as failure to do so may lead to erroneous conclusions about effect size and direction being drawn from invalid models. Results are presented through specification graphs based on codes developed in Ortiz-Bobea et al.71.

Additionally, we tested the robustness of our results from the site-based difference-in-differences regressions to alternative temporal specifications using an equal number of years before and after the intervention (8 yr for Ankerana and the Conservation Zone, 6 yr for CFAM and 5 yr for Torotorofotsy) and dropping individual years from the analysis. This did not change the significance or magnitude of our results (Supplementary Table 10 and Supplementary Figs. 6 and 7).

Outcome regressions

Deriving estimates of causal effect from statistical comparisons of outcomes between treated and control samples relies on the assumption that the latter is a robust counterfactual for the former. In a difference-in-differences analysis this assumes that, in the absence of the intervention, the treated sample would have experienced the same average change in outcomes over the before–after period as the control sample72. Parallel trends in outcomes between treated and control before the intervention are an essential prerequisite for this assumption. We tested this for each matched offset–control dataset using the following formula:

$$begin{array}{l}{{{mathrm{log}}}}({mathrm{Count}};{mathrm{of}};{mathrm{deforestation}} + 1)_{i,t} = quad beta _0 + beta _1{mathrm{Year}}_t + beta _2{mathrm{CI}}_i + beta _3({mathrm{Year}} times{mathrm{CI}}_{it})+ in _{i,t}end{array}$$

(1)

where the outcome is the log(y + 1)-transformed count of deforestation within sample i at year t and CI is a binary variable indicating whether the observation is from the offset (1) or control (0) sample.

Parallel trends in deforestation between offset and matched control samples in the years before the intervention were present for all offsets except for CFAM (Supplementary Fig. 5). Consequently, CFAM could not be used in the site-based difference-in-differences analysis. However, its effect is still captured in the results from the fixed-effects panel regression as this is not based on an identifying assumption of parallel trends between groups in the pretreatment period72.

To estimate the impact of protection within each individual offset, we ran an ordinary-least-squares difference-in-differences regression for each matched offset–control dataset using the following formula:

$$begin{array}{l}{{{mathrm{log}}}}({mathrm{Count}};{mathrm{of}};{mathrm{deforestation}} + 1)_{i,t} = beta _0 + beta _1{mathrm{BA}}_t + beta _2{mathrm{CI}}_i + beta _3({mathrm{BA}} times{mathrm{CI}})_{i,t}+ in _{i,t}end{array}$$

(2)

where BA and CI are binary variables indicating whether the observation occurred before (0) or after (1) the intervention, in the offset (1) or control sample (0). Given the non-normal properties of count data and the presence of zero values, a log(y + 1) transformation was applied to the outcome variable70,73. The coefficient of BA × CI and the corresponding confidence intervals were back-transformed (Supplementary Table 9) to obtain an estimate of the percentage difference in average annual deforestation between the offset and the matched control sample after protection, controlling for pre-intervention differences between samples (that is, the estimated counterfactual).

To estimate the overall impact of Ambatovy’s biodiversity offset policy at reducing deforestation we pooled the data for all four offsets and their corresponding matched control samples and ran a fixed-effects panel regression. The pooled data (n = 152) comprise an observation for each site (i = 8, 4 offset and 4 control) for each year (t = 19). The fixed-effects panel regression quantifies the effect of protection on the log-transformed count of deforestation controlling for site and year fixed effects, according to the following formula :

$${{{mathrm{log}}}}( {mathrm{Count}};{mathrm{of}};{{mathrm{deforestation}} + 1} )_{i,t} = beta _0 + beta _1{mathrm{Tr}}_{i,t} + propto _i + gamma _t + {it{epsilon }}_{it}$$

(3)

where Tr is a binary measure indicating the treated status of sample i in year t (Tr = 1 for observations from offset sites in the years following protection and 0 for all other observations), i and γt represent site and year fixed effects, respectively, and it represents the composite error. The coefficient of interest (β1) and the associated confidence intervals were back-transformed to obtain the percentage difference in average annual deforestation across all four biodiversity offsets following protection (the treatment effect).

Evaluating deforestation leakage

To determine whether protection of the four biodiversity offsets simply displaced deforestation into the surrounding forested landscape, we repeated the matching and outcome regressions with the subsample of units from each buffer zone assigned as the treated group17,58 (Supplementary Results).

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.


Source: Ecology - nature.com

Genotyping-in-Thousands by sequencing panel development and application for high-resolution monitoring of introgressive hybridization within sockeye salmon

Seascapes of fear and competition shape regional seabird movement ecology