Eco-environmental assessment model of the mining area in Gongyi, China
Technical criterion for ecosystem status evaluationOn March 3, 2015, the Ministry of Ecology and Environment of the People’s Republic of China approved the “Technical Criterion for Ecosystem Status Evaluation” as the national environmental protection standard. This standard is based on the former standards released in 2006, and 48 relevant documents from 2006 to 2012 were searched to propose new standards and factor weights based on actual utilization effects and expert guidance. The eco-environment assessment uses a comprehensive index (eco-environmental status index, EI) to reflect the overall state of the regional eco-environment. The indicator system includes the biological abundance index, vegetation coverage index, river density index, land stress index, and pollution loading index. These indexes reflect the abundance of organisms in the evaluated area, the level of vegetation coverage, the abundance of water, the intensity of land stress, and the extent of the pollution load. Each indicator was calculated according to its weight to obtain an eco-environment assessment map (Table 1). All parameters involved in the calculation are derived from this standard.Table 1 Weights of the evaluation indicators.Full size tableThe calculation of the eco-environment status is as follows:$$begin{aligned} EI & = 0.35*biological ; abundance ; index + 0.25*vegetation ; coverage ; index hfill \ & quad + 0.15*river ; density ; index + 0.15*(1 – land ; stress ; index) hfill \ & quad + 0.1*(1 – pollution ; loading ; index) hfill \ end{aligned}$$
(9)
Biological abundance indexThe biological abundance index refers to the number of certain organisms in this area. The calculation method is as follows:$$Biological , abundance , index , = , left( {BI , + , HQ} right)/2$$
(10)
In this formula, BI is the biodiversity index and HQ is the habitat quality index. When the biodiversity index does not have dynamic data updates, the change in the biological abundance index is equal to the change in the HQ.Biodiversity is a general term for the complexity of species and their genetic variation and ecosystems in space over time. Biodiversity plays an important role in maintaining soil fertility, ensuring water quality, regulating the climate, stabilizing the environment, and maintaining ecological balance.The BI method is as follows:$$BI = NPP_{mean} *F_{pre} *F_{tem} *(1 – F_{alt} )$$
(11)
NPPmean is the net primary productivity. Fpre is the annual average precipitation. Ftem is the temperature parameter. Falt is the altitude parameter.NPP refers to the amount of organic matter accumulated per unit area and unit time of green plants. NPP is the remainder of the total amount of organic matter produced by photosynthesis after deducting autotrophic respiration and is usually expressed as dry weight. In this study, the estimation of NPP was based on the absorbed photosynthetically active radiation (APAR) and actual light-use efficiency (LUE) (ε) of the CASA ecosystem model40. The CASA model is a process-based remote sensing model that couples ecosystem productivity and soil carbon and nitrogen fluxes, driven by gridded global climate, radiation, soil, and remote sensing vegetation index datasets41. The model can be expressed generally as follows:$$NPP(x,t) = APAR(x,t)*varepsilon (x,t)$$
(12)
The entire study area is divided into 11,303 pixels on a 30 * 30 m grid. x indicates the location of each pixel, and t indicates time; the data were collected once a month. APAR(x,t) represents the photosynthetically active radiation absorbed by pixel x in that month (gC * m−2* month−1). Ɛ(x, t) is LUE (gC * MJ−1) of the vegetation42.Estimation of the fraction of APAR using RS data is based on the reflection characteristics of the vegetation on the infrared and near-infrared bands. The value of APAR is determined by the effective radiation of the sun and the absorption ratio of the vegetation to the effective photosynthetic radiation. The formula is as follows:$$APAR(x,t) = SOL(x,t)*FPAR(x,t)*0.5$$
(13)
where SOL(x,t) represents the total amount of solar radiation at pixel x in month t, FPAR(x,t) represents the absorption ratio of the vegetation layer to the incident photosynthetically active radiation, and a constant of 0.5 indicates the ratio of the effective solar radiation that can be utilized by the vegetation to the total solar radiation.Since there is a linear relationship between FPAR and NDVI within a certain range, this relationship can be determined according to the maximum and minimum values of a certain vegetation type NDVI and the corresponding FPAR maximum and minimum values.$$FPAR(x,t) = frac{{(NDVI(x,t) – NDVI_{i,min } )}}{{(NDVI_{i,max } – NDVI_{i,min } )}}*(FPAR_{max } – FPAR_{min } ) + FPAR_{min }$$
(14)
where NDVImax and NDVImin correspond to the NDVI maximum and minimum values of the ith planting type, respectively. There is also a good linear relationship between FPAR and the simple ratio index (SR) of vegetation, which is represented by the following formula:$$FPAR(x,t) = frac{{(SR(x,t) – SR_{i,min } )}}{{(SR_{i,max } – SR_{i,min } )}}*(FPAR_{max } – FPAR_{min } ) + FPAR_{min }$$
(15)
where the values of FPARmin and FPARmax are independent of vegetation type and are 0.001 and 0.95, respectively; SRi,max and SRi,min correspond to the 95% and 5% percentiles, respectively, of the ith NDVI. SR(x,t) is represented by the following formula:$$SR(x,t) = frac{1 + NDVI(x,t)}{{1 – NDVI(x,t)}}$$
(16)
A comparison of the estimated results of FPAR-NDVI and FPAR-SR shows that the FPAR estimated by NDVI is higher than the measured value, while the FPAR estimated by SR is lower than the measured value, but the error is less than that estimated directly by NDVI. As a result, these two values can be combined, and their weighted average value is taken as an estimate of the estimated FPAR, while ɑ means weight:$$FPAR(x,t) = alpha FPAR_{NDVI} + (1 – alpha )FPAR_{SR}$$
(17)
Light use efficiency (LUE) refers to the ratio of chemical energy contained in organic dry matter produced per unit area over a certain period of time to the photosynthetically active radiation absorbed by plants projected onto the same area at the same time. Different vegetation types and the same types of vegetation have different light energy utilization rates in different living environments43. The differences are mainly due to the characteristics of the vegetation itself, temperature, moisture, and soil44. Vegetation has the highest utilization rate of light energy under ideal conditions, but the maximum light energy utilization rate in the real environment is mainly affected by temperature and moisture, which can be expressed as follows:$$varepsilon left( {x,t} right) = T_{varepsilon 1} left( {x,t} right) cdot T_{varepsilon 2} left( {x,t} right) cdot W_{varepsilon } left( {x,t} right) cdot varepsilon_{max }$$
(18)
where Tε1(x,t) and Tε2(x,t) represent the stress effects of low temperature and high temperature on light energy utilization, respectively, Wε(x,t) is the effect of water stress on the maximum light energy utilization under ideal conditions, and εmax is the maximum light energy utilization under ideal conditions (gC * MJ−1). The maximum solar energy utilization rate εmax varies depending on the vegetation type. In this study, the maximum light energy utilization rate of different land use types simulated by an improved Carnegie-Ames-Stanford Approach (CASA) model is used as the input parameter of light energy utilization in the CASA model (Table 2). The monthly maximum light energy utilization rate is determined in three steps: first, calculate the APAR, temperature, and water stress factors of all pixels; then, select the NPP measured data of the same time period in the study area; finally, simulate the εmax of vegetation according to the principle of minimum error45. Figure 5 shows the calculation process of NPP. The weight of each habitat type in the HQ is shown in Table 3. The weight value is derived from the official document30. To facilitate the calculation, this paper normalizes the calculation results from 0 to 1 (Fig. 6a).Table 2 Maximum LUE rates of different land use types.Full size tableFigure 5NPP calculation process.Full size imageTable 3 Weight of each habitat type in the HQ.Full size tableVegetation coverage indexThe vegetation coverage index was obtained from the NDVI, which is a simple, effective, and empirical measure of surface vegetation status. The vegetation index mainly describes the difference between the reflection of vegetation in the visible and near-infrared bands and the soil background. This index also reduces the solar elevation angle and noise caused by the atmosphere and is thus the most widely used and effective calculation method. Each vegetation index can be used to quantitatively describe the growth of vegetation under certain conditions. The expression is as follows:$$NDVI = frac{NIR – R}{{NIR + R}}$$
(19)
where NIR and R are reflectance values in the near-infrared and red bands, respectively.NDVI values are obtained by processing the RS images of the Landsat 8 satellite. This satellite is equipped with an operational land imager (OLI) that includes nine bands with a spatial resolution of 30 m, including a 15-m panchromatic band. To facilitate the calculation, this paper normalizes the calculation results from 0 to 1 (Fig. 6b).Figure 6Eco-environment assessment indexes and evaluation rating map (the first quarter was used as an example). (a) Biological abundance index; (b) vegetation coverage index; (c) river density index; (d) land stress index; (e) pollution loading index; and (f) environmental status classification. The Figure is created using ArcGIS ver.10.3 (https://www.esri.com/).Full size imageRiver density indexThe river density index refers to the total length of rivers, lakes, and water resources in the assessed area as a percentage of the assessed area, which is used to reflect the abundance of water in the assessed area and is calculated as follows:$$begin{gathered} River ; density ; index = (A_{riv} *river ; length / area + A_{lak} *water ; area/area hfill \ + A_{res} {*}amount ; of ; resources/area , )/3 hfill \ end{gathered}$$
(20)
where Ariv is the normalization coefficient of river length, with a reference value of 84.3704, Alak is the normalization coefficient of the lake area, with a reference value of 591.7909, and Ares is the normalization coefficient of water resources, with a reference value of 86.387. Finally, the calculation results were normalized from 0 to 1 (Fig. 6c).Land stress indexThe land stress index is the degree to which the land quality in the assessment area is under stress. The weight of the land stress index evaluation is shown in Table 4.Table 4 Weight of the land stress index evaluation.Full size tableThe calculation method is as follows:$$begin{gathered} Land ; stress ; index = A_{ero} *(0.4*severe ; erosion ; area + 0.2*{text{mod}}erate ; erosion ; area hfill \ + 0.2*construction ; land ; area + 0.2*other ; land ; stress ; area)/area hfill \ end{gathered}$$
(21)
where Aero is the normalization coefficient of the land stress index, with a reference value of 236.0436. According to the “Classification criteria for soil erosion”46, the influencing factors of soil erosion, vegetation, soil texture, landform, and precipitation are ranked according to importance. In the calculation of the land stress index, all the land is divided into three categories, in which the weight of severe erosion is 0.4, the weight of non-erosion is 0, and the other erosion types such as moderate erosion and construction land are 0.2. The areas with severe erosion include vegetation coverage less than 30% and areas of soil erosion greater than 3.7 mm/a due to human activities. These areas are generally developed on highly erosive-sensitive soils. Cinnamon soil and loess soil in the study area are highly erosive-sensitive soils. Therefore, the industrial and mining areas of cinnamon and loess soil types are regarded as severe erosion areas. Areas with vegetation coverage greater than 50% are non-eroded areas, so water bodies and woodlands are divided into non-erodible areas. All areas except these two types have a weight of 0.2. Finally, the calculation results were normalized from 0 to 1 (Fig. 6d).Pollution loading indexThe pollution loading index refers to the load of pollutants in a certain area or an environmental element. In this study, the AQI was used to calculate the pollution loading index, and the results were normalized from 0 to 1 (Fig. 6e).The eco-environmental evaluation score was calculated based on the national environmental protection standard according to the weight of each indicator (Fig. 6f).Improved evaluation system and intelligent evaluation modelImproved evaluation systemConsidering that the evaluation factors in the national environmental protection standards are applicable to ordinary areas, areas affected by mines should have more evaluation factors than those in the standards. Thus, an improved evaluation system was proposed. The improved evaluation system has added factors that affect the environment of the mine based on the factors of the original system. The impact of the mine on the environment is reflected in the pollution of the atmosphere, such as dust from open pits and industrial waste from concentrators; the occupation of land by solid waste, such as ore piles and coal piles; soil pollution, such as the diffusion of heavy metals from coal piles, coal mine concentrator plants, and mines; and the increased likelihood of geological disasters, such as collapse caused by underground mining, spontaneous coal combustion and landslides caused by open-pit mining surfaces. Therefore, the improved evaluation system adds an air pollution range, a solid waste area, a geologic hazard range, and a metallic and non-metallic mine soil pollution buffer to the national environmental protection standards.The area of air pollution in mining areas is generally near open pits and concentrator plants. Therefore, the air pollution range was selected within 50 m around the open pits and the concentrator plants. Due to the dilution and dispersion of the air itself, an estimate of the pollution is the reciprocal of the wind speed (Fig. 7a).Figure 7New factors in the improved evaluation system. (a) Air pollution range; (b) solid waste area; (c) geological hazard range; (d1) non-metallic mine soil pollution buffer; and (d2) metal mine soil pollution buffer. The Figure is created using ArcGIS ver.10.3 (https://www.esri.com/).Full size imageMine solid waste pollution includes a large amount of waste rock from open-pit mining and pit mining, coal gangue produced by coal mining, tailings from beneficiation and slag from smelting. These solid wastes are generally piled up near the mining area. They not only occupy large areas of land and induce geological disasters such as landslides and mudslides but also cause chemical pollution, spontaneous combustion, and radiation from radioactive materials due to long-term stacking. This may affect the health and safety of humans and other biological organisms. The scope of the solid waste area is determined by the ore piles, coal piles, and dumping sites (Fig. 7b).Mine geological disasters are caused by a large number of mining wells and rock and soil deformation, as well as serious changes in the geological, hydrogeological, and natural environments of the mining area, endangering human life and property and destroying mining engineering equipment and mining resources. In this study, the geologic hazard range consists of areas with underground mining stopes and coal piles (Fig. 7c).After the pollutants generated by the mining operation enter the soil, physical and mechanical absorption, retention, colloidal physicochemical adsorption, chemical precipitation, bioabsorption, etc. of the suspended pollutants through the soil continue to accumulate in the upper soil. When pollutants reach a certain maximum, they cause deterioration of the soil composition, cycle, properties, and functions and begin to accumulate in plants, which affects the normal growth and development of plants, decreases crop yield and quality, and ultimately affects human health. Metallic and non-metallic minerals have different effects on soil pollution. The pollution of soil by non-metallic minerals mainly occurs in coal mines and coal piles, and the buffer zone is centred on the coal mines and coal piles. Coal production activities can cause heavy metals in coal piles to enter the soil and cause pollution. Due to different types of heavy metals, the range of soil contamination is different47. Combining the non-metallic mineral industrial squares and coal mine-based non-metallic minerals around the heavy metal soil pollution range, the buffers are graded at 30, 200, and 1000 m (Fig. 7d1)43. The metal mines in Gongyi are mainly aluminium ore and iron ore. Referencing the spread range of heavy metal pollution in the soil of aluminium ore and iron ore mines48,49,50,51, the buffers are graded at 50, 100, 300, and 500 m (Fig. 7d2).The four new elements in the improved evaluation system are normalized from 0 to 1 during the calculation.Intelligent evaluation modelArtificial neural networks, decision trees, and SVMs were calculated using IBM SPSS Modeler software to find an intelligent model suitable for environmental assessment of the mine in the study area. Then, several models with high evaluation accuracy were selected. The SVM, CART, and C5.0 models were chosen for further comparison. The sampling points were selected randomly; 700 sampling points were selected from the area away from the mining area; 100 sampling points were selected from the mining area after random sampling by mine type, and these points were used as training samples. Non-mining evaluation scores were based on the national environmental protection standard, while the mining area scores were based on field investigation. In the field investigation, preliminary scoring of the sampling points was conducted according to mining type, mining intensity, air quality, and surrounding environment. Then, a photo of the field was taken at every sampling point in the mine, and experts were invited to further score the area according to the photo. This score is the relative score obtained by referencing the national environmental protection standard.The index layers of the training samples were used as input, and the scores were used as the output to train the machine learning models. The trained models were applied to the entire study area, and all points except the training sample points were used for verification. After further comparison with SVM, CART, and C5.0, the evaluation accuracy rates of the three methods in the mining area and non-mining area were obtained. In the non-mining area, the model evaluation results of various land use types were compared with the national environmental protection standards. The accuracy in various land use types is shown in Table 5. In the mining area, the model evaluation score is compared with the score from the experts, and the obtained accuracy table is shown in Table 6.Table 5 Accuracy of each algorithm in various land use types in non-mining areas.Full size tableTable 6 Accuracy of each algorithm in the mining area.Full size tableIn non-mining areas, the accuracy of the SVM model is significantly better than that of the other two methods. However, in the mining area, the accuracy of the CART model is higher. Therefore, the SVM model was used to evaluate the area away from the mine, and the CART model was used to evaluate the mining area. The evaluation results of these two models were combined to obtain the evaluation map of the entire study area. More