
Algorithm and model performance
XGBoost and random forest models both performed better (higher R² and lower MSE) than the linear regression model when data quality and sample size were the same. However, neither model yielded good results when the data set was small; performance improved as the data size increased. The data set could be extended by interpolation over coordinates24,25,26,27,28,29,30, which may improve the performance of XGBoost. Besides data size, model performance could be improved by adjusting important parameters and by incorporating, for example, periodic time-series analysis with multiple data samples. The optimal model can predict both surface and bottom fishery resource densities under given abiotic factors.
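The two metrics used for this comparison can be written out explicitly. The following is a minimal sketch in plain Python, not the code used in the study; the observed values and the two sets of predictions are hypothetical and only illustrate how a higher R² and a lower MSE identify the better-fitting model.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed densities and predictions from two models
# (illustrative numbers only, not data from the study).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
pred_a = [1.1, 1.9, 3.2, 3.8, 5.0]  # closer fit
pred_b = [2.0, 2.0, 3.0, 3.0, 4.0]  # coarser fit

for name, pred in (("model A", pred_a), ("model B", pred_b)):
    print(name, "MSE =", round(mse(observed, pred), 3),
          "R2 =", round(r2(observed, pred), 3))
```

In this toy case model A has both the smaller MSE and the larger R², which is the pattern described above for the tree-based models relative to linear regression.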
XGBoost and random forest models differ10,23, so the sensitivity scores they calculate also differ, especially for surface NASC, where the contributions of individual factors varied between the algorithms. Importance scores can therefore be compared quantitatively only within the same algorithm. Nevertheless, the contribution of each factor was calculated in a similar way by all algorithms, especially for the factors with high sensitivity scores. Besides XGBoost and random forest models, support vector machines (SVM)31 and logistic regression32,33,34 are available for feature selection.
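Because raw scores are not comparable across algorithms, one way to compare them is by rank rather than by value. The sketch below uses Spearman's rank correlation for this purpose; this is an illustrative choice and not a method reported in the study, and the score values, variable names, and the "Depth" label are hypothetical.

```python
def ranks(scores):
    """Map each factor to its rank (1 = most important). Assumes no tied scores."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def spearman(scores_a, scores_b):
    """Spearman rank correlation for two score dicts with identical keys and no ties."""
    rank_a, rank_b = ranks(scores_a), ranks(scores_b)
    n = len(rank_a)
    d2 = sum((rank_a[k] - rank_b[k]) ** 2 for k in rank_a)
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


# Hypothetical importance scores from two algorithms (not values from the study).
xgb_scores = {"ST": 0.30, "N2-10 m": 0.25, "Depth": 0.20, "DS": 0.15, "SS": 0.10}
rf_scores = {"ST": 0.40, "N2-10 m": 0.20, "Depth": 0.22, "DS": 0.12, "SS": 0.06}

print(spearman(xgb_scores, rf_scores))
```

Here the two score sets differ in magnitude for every factor, yet only two adjacent factors swap places, so the rank agreement remains high.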
Contribution of the factors to surface and bottom NASC
NASC in each water layer is assumed to be directly related to the factors of that layer. For example, in the present study, surface NASC was related to ST and N2-10 m, which were first-level factors. Similarly, bottom NASC was related to N2-20 m, which was also a first-level factor. However, there were exceptions. In the sensitivity ranking for surface NASC, certain surface factors, such as N2-0 m, N4-0 m, N3-10 m, SS (2 m above surface mixed layer), Si-10 m, N3-0 m, P-0 m, and N4-10 m, were less important than some bottom factors; all had smaller sensitivity scores than BT (2 m in the bottom cold water layer). In the sensitivity ranking for bottom NASC, BT and BS at 2 m of the surface mixed layer were less important than ST (2 m above the surface mixed layer). Possible reasons are that the direct factors of a water layer were less influential than other factors, such as food influenced by surface factors, or that there were no significant direct effects.
Sensitivity scores of geographical, static, and dynamic factors to the surface and bottom NASC
The summed sensitivity scores of geographical, static, and dynamic factors for surface NASC were 0.087, 0.691, and 0.221, respectively, and the average values were 0.029, 0.033, and 0.013, respectively. The results indicated that there were significant differences among the abiotic factors of surface NASC: the sensitivity scores of static factors were higher than those of the dynamic and geographical factors, and dynamic factors were the weakest on average. This suggests that surface fishery resource density was affected more directly and more strongly by static factors than by other factors.
For bottom NASC, the summed sensitivity scores of geographical, static, and dynamic factors were 0.078, 0.530, and 0.392, respectively, and the average values were 0.026, 0.025, and 0.023, respectively. As with surface NASC, the summed score of static factors was the highest; however, its average value was close to those of the other two groups. This suggests that bottom fishery resource density was influenced by multiple factors. However, human factors, such as overfishing, were not considered, and therefore we are unsure of their effect on bottom fishery resource density.
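The difference between the summed and the average score of a group comes down to how many factors the group contains. The sketch below shows the aggregation in plain Python. The factor names (G1, S1, D1, ...) and their scores are hypothetical placeholders, and the only values taken from the text are the reported group sums, which are used for a simple consistency check.

```python
def summarize(scores, groups):
    """Return {group: (sum, mean)} of per-factor sensitivity scores."""
    totals, counts = {}, {}
    for factor, score in scores.items():
        group = groups[factor]
        totals[group] = totals.get(group, 0.0) + score
        counts[group] = counts.get(group, 0) + 1
    return {g: (totals[g], totals[g] / counts[g]) for g in totals}


# Hypothetical per-factor scores and group labels (placeholders, not study data).
scores = {"G1": 0.10, "G2": 0.05,
          "S1": 0.20, "S2": 0.15, "S3": 0.10, "S4": 0.05,
          "D1": 0.20, "D2": 0.15}
groups = {"G1": "geographical", "G2": "geographical",
          "S1": "static", "S2": "static", "S3": "static", "S4": "static",
          "D1": "dynamic", "D2": "dynamic"}

summary = summarize(scores, groups)
for group, (total, mean) in summary.items():
    print(group, "sum =", round(total, 3), "mean =", round(mean, 3))

# Consistency check on the group sums reported in the text:
# each set of three sums should total approximately 1.
reported_surface = {"geographical": 0.087, "static": 0.691, "dynamic": 0.221}
reported_bottom = {"geographical": 0.078, "static": 0.530, "dynamic": 0.392}
print(round(sum(reported_surface.values()), 3), round(sum(reported_bottom.values()), 3))
```

In the placeholder data the static group has the largest sum but not the largest mean, because it contains more factors. A group can dominate the total in this way even when its individual factors are not the most influential.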
Important abiotic factors
We found that the factors contributed differently in different water layers, which could result from differences in the composition of fishery organisms. Some abundant organisms in the surface mixed layer may be substantially affected by one or a few factors, so that these factors contribute highly to surface fishery density as first-level factors. Similarly, the bottom cold water layer may contain several organisms that are affected by different factors. Bottom fishery resource density, which reflects many species, would therefore be influenced by multiple factors without a single significant one. There are many kinds of fishery resources in the offshore waters of the northern South China Sea, and their composition is complex. The majority of fishery organisms live in the bottom cold water layer.
Temperature is one of the major abiotic stress factors. ST (2 m above the surface mixed layer), belonging to group A and level one, was the most important factor for both the surface and bottom cold water layers. Moreover, it accounted for the largest difference in fishery resources compared with other factors. Sea surface temperature is one of the major factors influencing the surface layer and has a direct impact on surface NASC; for example, jellyfish show a preference for particular temperatures and temperature differences35. However, it also had a great influence on bottom NASC, probably because ST can indirectly affect the bottom cold water layer. For example, temperature influences fish parasites36 and fish community structure37. DT, belonging to level two in group B, had a strong effect on bottom NASC and was also one of the important dynamic factors in the first level, indicating that temperature change greatly influenced fish behavior38,39. However, the sensitivity and extent of the reaction to temperature variation differed with species and age40.
Nitrite is the intermediate oxidation state between ammonia and nitrate, and nitrite toxicity can affect fish. Nitrite is usually taken up across the gills along with chloride, which disturbs several physiological functions, including ion regulation, respiration, and cardiovascular, endocrine, and excretory processes41. Nitrite toxicity differs widely among fishes depending on multiple internal and external factors. Important factors include water quality (i.e., pH, temperature, and cation, anion, and oxygen concentrations), exposure time, species, size, age, and individual fish susceptibility42. N2-10 m, one of the important static factors for the surface mixed layer and belonging to level two in group A, directly affected surface NASC, which indicated that sea creatures are more sensitive to nitrite. N2-20 m, one of the static characteristics of the near-bottom layer, was the first important feature in class B and had a direct impact on bottom NASC. This also indicates that nitrite was more likely to have a direct impact on marine life in the bottom layer. In addition, the factors related to nitrite, such as N3-d1020 and N4-10 m, also had some influence on the bottom cold water layer.
Water depth, belonging to group C, greatly influenced both surface and bottom NASC. The proportion of certain fish species increased with an increase in water depth. For example, the proportion of cephalopods was relatively high within the range of 40 to 100 m, and the proportion of crustaceans was higher within the range of 10–20 m43.
Salinity difference (DS), which belonged to group C and was one of the dynamic factors, strongly affected both surface and bottom NASC. Salinity varied only slightly within the same period; therefore, SS (2 m above surface mixed layer) and BS (2 m above the bottom cold water layer) did not correlate with factors related to seasonal fish migration. However, DS still influenced the vertical distribution in both the surface and bottom cold water layers.
In addition, P-d1020 and BT (2 m above the bottom cold water layer) had some effect on NASC. They may have an indirect effect on the distribution of fishery resources or a direct effect with a time lag, although there was no clear evidence of their significant sensitivity in this study.
In contrast, certain factors had less influence on the water layers, such as SS (2 m above surface mixed layer), BS (2 m above bottom cold water layer), P-0 m, N4-20 m, and N2-d020; however, this does not imply that they had no effect. The spatial distribution and age structure of organisms vary within water layers, which could lead to differences in the sensitivity of factors for each layer. If the relationships between species and factors were known, or the sensitivity ranking of factors could be obtained, then the organisms and their proportions in different water layers could be estimated.
Fishery resource distribution and other factors
There are many different kinds of abiotic factors, and only a few of them were used in this study. The abiotic factors collected at the same sampling point are concurrent. However, time-lagged data for some abiotic factors, such as chlorophyll, are also well worth studying. Chlorophyll is often considered to have a 30-day accumulation period before it is reflected in higher trophic levels through ocean food chains44,45. However, the food chain is affected by many factors, e.g., human interference and alien species. Therefore, time-lag studies may be better carried out where there is no human interference. Similarly, synchronous studies are often susceptible to external factors, such as strong changes in the weather, which could lead to a large change in the sensitivity ranking of important factors by affecting the surface mixed layer46,47. These effects may be related to the diverse behavior of marine organisms in the face of changing living conditions.
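A time-lag analysis of this kind pairs each value of the driving factor with the response observed a fixed interval later. The sketch below illustrates the idea with a lagged Pearson correlation in plain Python. It is not part of the present study, which used concurrent data; the two series are hypothetical, and the lag is expressed in sampling steps, so a 30-day lag would correspond to 30 divided by the sampling interval in days.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(driver, response, lag):
    """Correlate driver[t] with response[t + lag]; lag is in sampling steps."""
    if len(driver) != len(response):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(driver) - 1:
        raise ValueError("lag must leave at least two paired samples")
    return pearson(driver[:len(driver) - lag], response[lag:])


# Hypothetical series: the response repeats the driver two steps later
# (the first two response values are padding).
chlorophyll = [1, 3, 2, 5, 4, 6, 5, 8]
density = [0, 0, 1, 3, 2, 5, 4, 6]

# Pick the lag (0 to 3 steps) with the strongest correlation.
best_lag = max(range(4), key=lambda k: lagged_correlation(chlorophyll, density, k))
print(best_lag)
```

Scanning several candidate lags and keeping the one with the strongest correlation is a simple way to estimate the delay. With real survey data, external disturbances such as those mentioned above would make the peak less clear-cut.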
In addition to abiotic factors, the distribution of fishery resources, especially bottom fishery resources, may be affected by other ecological factors (human factors, biotic factors). Many human factors can affect the distribution of marine fishery resources45, including fishing, breeding, and wastewater discharge. The human factors affecting seabed fishery resources described in this study mainly refer to overfishing, with bottom trawling as the main fishing method. Overfishing also affects the structure of the food chain, with unpredictable effects on time lag. As for biotic factors, different species act as biotic factors for each other, and their mutual relations include predation, competition, and symbiosis48,49. Further, even within the same species, there are intraspecific relationships.
The vertical probability distribution characteristics of fishery resources obtained by fisheries acoustics techniques differ from those obtained by traditional fishing (i.e., bottom trawls and fishing nets with LED lights), which is two-dimensional. Here, a third dimension was added, making the analysis of the probability distribution of fishery resources more comprehensive and better showing the importance of the density distribution of fishery resources in different water layers. Stratified analysis of fishery resource density improved the evaluation of fishery resources, as it is more multidimensional than traditional plane analysis (e.g., fishery resources assessment model, physical habitat simulation model).
Source: Ecology - nature.com