
Ecological dependencies make remote reef fish communities most vulnerable to coral loss

Fish distribution

We rasterized a detailed reef distribution vector map35 at 5 × 5 latitude/longitude degrees (by considering as reef area each cell in the raster intersecting a polygon in the original shapefile). We collected all the occurrences of fish species intersecting the rasterized reef area from both the Ocean Biogeographic Information System36 and the Global Biodiversity Information Facility37. We used taxonomic and biogeographical (i.e., latitudinal/longitudinal extremes for a given species) information from FishBase38 to exclude potential incorrect occurrences (i.e., all the records falling outside the known species ranges). We also restricted the list to all the species for which FishBase provided relevant ecological information (as these were needed to evaluate prey-predator species interactions and identify indirect links between fish species and coral, see below). The filtered list comprises 9143 fish species.

For these species, we used occurrence data to generate species ranges. For this, we used the α-hull procedure39, but instead of pre-selecting an α parameter and using it for all species, we developed a procedure to obtain conservative species ranges while including most of the known occurrences. First, we selected a very small α (0.001), to obtain a hull including most of the occurrences. Then, we progressively incremented α in small steps (0.005), computing for each increment the ratio between the relative reduction in hull area (with respect to the previous hull) and the relative reduction in the number of occurrences included in the hull (with respect to the total number of available occurrences for the target species). We stopped increasing α when the ratio fell below 10. This procedure ensured that only isolated sites far from the core distribution of a species were excluded, while the range was stretched as much as possible around known occurrences.
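The stopping rule can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the hull areas and the numbers of included occurrences have already been computed for each successive α value, and the function name, argument names and the choice to retain the last step before the ratio drops below 10 are assumptions made here.

```python
def last_alpha_step_kept(hull_areas, n_included, n_total, min_ratio=10.0):
    """Return the index of the last alpha step retained by the stopping rule.

    hull_areas[i] and n_included[i] are the hull area and the number of
    occurrences inside the hull at the i-th alpha value (alpha increasing).
    """
    kept = 0
    for i in range(1, len(hull_areas)):
        # Relative area reduction with respect to the previous hull.
        area_drop = (hull_areas[i - 1] - hull_areas[i]) / hull_areas[i - 1]
        # Relative loss of occurrences with respect to all available records.
        occ_drop = (n_included[i - 1] - n_included[i]) / n_total
        # Assumption: a step that loses no occurrences is always accepted.
        if occ_drop > 0 and area_drop / occ_drop < min_ratio:
            break
        kept = i
    return kept
```

In this reading, a step is accepted as long as each occurrence dropped buys a comparatively large reduction in hull area, which is what trims isolated outlying records.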

After delineating ranges for each species, we rasterized the reef vector map at a higher resolution (1 × 1 latitude/longitude degree) and used it as a reference layer to extract fish occurrences at each reef location. This resolution is finer than that used by other global studies on reef fish diversity and distribution40,41. We took the 1° × 1° reef raster as the reference grid in all subsequent analyses and spatial interpolations, considering all the reef cells hosting at least five fish species (n = 1761).

Fish distribution validation

To validate the fish distribution data, we compared them with a smaller independent dataset (GASPAR) providing fish occurrences for 196 globally distributed reef localities42, which we rasterized against the same reference grid used for our fish and coral distribution data. Because this dataset is based on comprehensive check-lists, its information can be considered as ascertained presence-absence data. Thus, we compared our list of fish occurrences (at one degree) in each cell where data from the GASPAR dataset were also available, computing the true skill statistic as TSS = [(a × d) − (b × c)]/[(a + c) × (b + d)], with a being the number of predicted and observed occurrences; b the number of predicted but not observed occurrences; c the number of observed but not predicted occurrences; and d the number of occurrences neither observed nor predicted. We obtained a median TSS of 0.53, with a median sensitivity (the proportion of correctly predicted presences) of 0.60, and a median specificity (the proportion of correctly predicted absences) of 0.96, indicating that our mapped ranges were sufficiently conservative and rarely generated false presences. Finally, given that we were analysing coral reef fishes, we excluded a few grid cells where our methods returned no fish species.
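For a single grid cell, the TSS formula above can be computed directly from the four confusion-matrix counts. The sketch below is illustrative only; the function name and the example counts are hypothetical and not taken from the study.

```python
def skill_scores(a, b, c, d):
    """Return (TSS, sensitivity, specificity) from confusion-matrix counts.

    a: predicted and observed      b: predicted, not observed
    c: observed, not predicted     d: neither predicted nor observed
    """
    sensitivity = a / (a + c)   # proportion of correctly predicted presences
    specificity = d / (b + d)   # proportion of correctly predicted absences
    tss = (a * d - b * c) / ((a + c) * (b + d))
    return tss, sensitivity, specificity


# Hypothetical counts for one cell.
tss, sens, spec = skill_scores(a=60, b=4, c=40, d=96)
```

The same quantity equals sensitivity + specificity − 1, which is a convenient way to cross-check the computation.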

Environmental data

We obtained environmental data (surface temperature, salinity, pH, and total chlorophyll as a proxy for productivity) at a spatial resolution of 5 arcmin from Bio-ORACLE v2.0 (ref. 43), and we upscaled these data to the reference reef grid (averaging the variable values in each 1 × 1 latitude/longitude degree grid cell).
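Upscaling by averaging can be sketched as below, assuming the fine-resolution layer is available as a list of (longitude, latitude, value) samples. The function name and the cell-indexing convention (cells identified by the floor of their coordinates) are assumptions for illustration; the actual raster tooling used is not specified here.

```python
import math
from collections import defaultdict


def upscale_mean(points, cell_size=1.0):
    """Average fine-resolution values within each coarse grid cell.

    points: iterable of (lon, lat, value) tuples.
    Returns a dict mapping (lon_index, lat_index) to the mean value.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for lon, lat, value in points:
        key = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical temperature samples falling in two 1-degree cells.
cells = upscale_mean([(10.2, -5.7, 20.0), (10.8, -5.1, 22.0), (11.1, -5.5, 30.0)])
```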

Human impact

As a measure of human impact on reef localities, we used the 14 cumulative human impact layers (for 2013)19 available at https://doi.org/10.5063/F12B8WBS. For the purposes of our analysis, we categorized them into “local hazards” stemming from direct human impacts (specifically, six impact layers related to fishing activities plus light pollution, shipping, nutrient pollution, organic chemical pollution, and direct human interactions on coastal and near-coastal habitats, such as trampling); and “global hazards” related to planet-wide processes (warming, acidification and sea level rise). The original dataset has a resolution of 1 km² and was therefore upscaled to the reference reef grid (averaging the variable values in each 1 × 1 latitude/longitude degree grid cell).

Travel time to cities

We quantified the “remoteness” of each reef locality in terms of travel time (based on the fastest possible local means of terrestrial and aquatic transportation, hence excluding air travel) to the closest human settlement. For this, we used the procedure described in Weiss et al.21, which consists of first combining information on land types and use, topography, distribution of roads and railways, and position of national borders to derive a friction surface raster map indicating the average speed at which humans can travel through each pixel; and then applying an algorithm to identify the least costly paths (i.e. those requiring the shortest travel time) from each pixel to a target locality (e.g. a city)21.

The original publication21 provides a global map of accessibility that does not include water localities, which is clearly problematic for reefs. We therefore produced a new map of travel time (in hours) that also includes water pixels (at the same resolution as Weiss et al.21, i.e. 1 km²) by using their friction map, the same layer of human urban centres (the ‘high-density centres’ variant of the Global Human Settlements44) and the same cost distance algorithm (cumulative cost distance, which we computed using SAGA GIS45). Then, we upscaled the high-resolution map to our grid of 1 × 1 degree reef localities (computing the mean accessibility for each 1 × 1 degree cell).

Bleaching susceptibility

We downloaded annual layers reporting maximum bleaching alert level at the global scale and at a resolution of 50 km from 1985 to 2019 (ref. 46). Alert levels range from 0 (no stress) to 4 (mortality likely). We upscaled each layer to the reef reference grid (averaging alert level data) and computed an index of bleaching susceptibility as the average of the recorded alert levels in each coral reef pixel of the reference raster.

Building ecological networks of fish → fish interactions

We built networks of fish → fish interactions by using a multi-step procedure. (1) We generated a model capable of predicting the probability of occurrence of a prey-predator interaction between two given fish species based on some of their functional and ecological traits. For this, we obtained information on fish body size, trophic level, minimum and maximum depth, and habitat preference for 17,722 fish species from FishBase38, OBIS36 and GBIF37 (from the latter two sources, we specifically derived complementary data on species depth occurrences, which we used to fill in gaps in FishBase). We combined this information with a large dataset of known prey-predator interactions assembled from the Global Biotic Interactions dataset, GLOBI47. After filtering GLOBI according to the set of species with available ecological information and removing replicated records, we obtained 11,188 individual prey-predator pairs (for a total of 2643 species). We then identified an identical number of absences (pairs of species not interacting, and hence not having a link in the network). GLOBI includes only observed interactions, while it does not provide explicit information on non-interacting species. Although one can ideally generate a list of absences by sampling from all pairwise combinations of species not listed by GLOBI, this procedure might lead to the mislabelling of an actual prey-predator pair as a non-interacting pair simply because the species combination is missing from the database. To reduce this risk and generate “reliable” pseudo absences (that is, truly representative of associations not possible in the real world), we used a stochastic approach where we sampled species pairs at random from all possible species combinations not present in GLOBI with the important addition of two constraints; namely, the prey needed to be at least 30% larger than the predator and/or the predator needed to have a trophic level ≤3.0 (according to FishBase trophic classification).
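The constrained pseudo-absence sampling can be sketched as follows. This is an illustrative reading of the two constraints, not the authors' implementation: the function name, the trait layout (body size and trophic level per species) and the example values are hypothetical.

```python
import random


def sample_pseudo_absences(traits, known_pairs, n, seed=0, max_tries=100_000):
    """Draw up to n (prey, predator) pairs that are unlikely to interact.

    traits: dict mapping species -> (body_size, trophic_level).
    known_pairs: set of observed (prey, predator) pairs to exclude.
    """
    rng = random.Random(seed)
    species = sorted(traits)
    absences = set()
    tries = 0
    while len(absences) < n and tries < max_tries:
        tries += 1
        prey, predator = rng.sample(species, 2)  # two distinct species
        if (prey, predator) in known_pairs:
            continue
        prey_size, _ = traits[prey]
        predator_size, predator_tl = traits[predator]
        # Keep the pair if the prey is at least 30% larger than the predator
        # and/or the predator has a trophic level of 3.0 or lower.
        if prey_size >= 1.3 * predator_size or predator_tl <= 3.0:
            absences.add((prey, predator))
    return sorted(absences)


# Hypothetical traits: (body size in cm, trophic level).
traits = {"A": (10, 2.0), "B": (50, 4.2), "C": (120, 4.5), "D": (8, 2.8)}
pairs = sample_pseudo_absences(traits, {("A", "B"), ("D", "C")}, n=5)
```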

(2) We then used a random forest classifier (a machine learning technique; we used the Python package Scikit-learn48) where the dependent variable was the presence or (pseudo) absence of interactions, and the independent variables were prey and predator traits (prey body size, prey trophic level, prey min and max depth and eight dummy variables for habitat; and the same variables for predator, for a total of 24 independent variables). We first explored the ability of the model by training it on a random subsample (50%) of the dataset (including true presences and pseudo absences), and then testing it on the remaining fraction. The model performed very well, being capable of predicting observed (true positives) and unobserved interactions (true negatives) in the testing set with an exceptional precision and accuracy (TSS = 0.93; type I error rate = 0.05; type II error rate = 0.02). After this first exploration, we used the full dataset to train the model to be used on the actual data. Out-of-bag validation score in the final model based on the complete dataset was >0.97.
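As a consistency check on the reported scores, and assuming that the type I and type II error rates denote the false-positive and false-negative rates respectively (an interpretation adopted here, not stated explicitly above), the TSS follows from sensitivity and specificity:

$$\mathrm{TSS}=\mathrm{sensitivity}+\mathrm{specificity}-1=(1-0.02)+(1-0.05)-1=0.93$$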

The random forest predictor was used to assess the probability of trophic interaction for a large list of potential interactions generated by combining all fish species from our reef fish occurrence dataset known to rely mainly or exclusively on fish for their survival (i.e. “true piscivores”, FishBase trophic level > 3.5), with all the fish in the dataset. The full list included 31,768,450 potential interactions, which we reduced to 6,721,450 interactions by keeping only the interacting pairs identified by the random forest classifier with a probability ≥0.9.

(3) If the ecological dependency between two species is actually manifested then the two species must obviously co-occur at some locations, and vice-versa, co-occurrence is a necessary pre-requisite for an ecological dependency. Following this logic, we took a final, additional step to further filter and improve the fish → fish interaction list. In particular, we quantified the tendency for species to co-occur in the same locality as one potential proxy layer for species interactions, complementary to our other approaches. There are various factors that can affect the co-occurrence of two species. In a simplification, this can emerge from stochasticity, shared environmental requirements, shared evolutionary history, and ecological dependencies. We attempted to disentangle the effect of the last factor from the first three.

For each target species pair, we computed overlap in distribution as the raw number of reef localities where both target species were found. Then, we compared this number with the null expectation obtained by randomizing the distribution of species occurrences across reef localities. We designed a null model accounting for randomness, species niche and biogeographical history, and hence randomizing the occurrence of species only within areas where they could have possibly occurred according to environmental conditions and biogeographical factors (e.g., in the absence of hard or soft barriers). To implement the null model, we first excluded from the list of potential localities all the areas outside the biogeographical regions where the target species had been recorded, with regions identified according to Spalding et al.49. Then, within the remaining areas, we identified all the reef localities with climate envelopes favourable to target species survival. For this, we identified the min and max of major environmental drivers (mean annual surface temperature, salinity, pH) where the target species occurred, and then we identified all the localities with conditions not exceeding these limits. We generated, for each pairwise species comparison, one thousand randomized sets of species occurrences by randomly rearranging species occurrences within all suitable localities. We quantified co-occurrence between the species pair in each random scenario. Finally, we compared the observed co-occurrence with the random co-occurrences, computing a p-value as the fraction of null models with co-occurrence identical to or higher than the observed one. We kept only the pairs with a p-value < 0.05. This further reduced the fish → fish list to 1,365,863 interactions. We used this list to build site-specific interaction networks in all 1° × 1° reef localities of our reference grid, by filtering it according to local fish species composition.
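The randomization test for one species pair can be sketched as below, assuming the sets of suitable localities (after the biogeographical and climate-envelope filters) have already been identified for each species. Function and argument names are hypothetical, and the sketch is a simplified illustration rather than the authors' implementation.

```python
import random


def cooccurrence_p_value(occ_a, occ_b, suitable_a, suitable_b, n_null=1000, seed=0):
    """Fraction of null scenarios with co-occurrence >= the observed value.

    occ_a, occ_b: sets of localities where each species was recorded.
    suitable_a, suitable_b: sets of localities where each species could occur.
    """
    rng = random.Random(seed)
    observed = len(occ_a & occ_b)
    pool_a, pool_b = sorted(suitable_a), sorted(suitable_b)
    hits = 0
    for _ in range(n_null):
        # Rearrange each species' occurrences within its suitable localities,
        # keeping the number of occupied localities unchanged.
        rand_a = set(rng.sample(pool_a, len(occ_a)))
        rand_b = set(rng.sample(pool_b, len(occ_b)))
        if len(rand_a & rand_b) >= observed:
            hits += 1
    return hits / n_null
```

A pair would be retained when the returned value is below 0.05, that is, when the two species share more localities than expected from niche and biogeography alone.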

Measuring fish-coral dependency

We compiled from literature22,23,24,25 a list of fish species known to be associated with corals, in terms of habitat and/or trophic specialization. This list includes 44% of the fish species we used in our analysis (4040/9,143). As above, we used the known associations (or lack thereof) in the dataset to identify coral dependency in the unassessed fish. For this, we trained two independent random forest classifiers (again using the Python package Scikit-learn48), one to model generic habitat associations, and the other one to model corallivory. In both models, the dependent variable was the presence/absence of coral-association, and the independent variables were the same ecological features used to predict fish → fish trophic interactions (i.e. prey body size, prey trophic level, prey min and max depth and eight dummy variables for habitat), plus an additional variable quantifying the fraction of documented coral-associated species in the family of the target fish. Both models showed high precision and accuracy (with a TSS of 0.57 for the habitat association model, and of 0.81 for the corallivory model). Combining the list of coral dependent species from literature (n = 897) with our model predictions (n = 356) yielded a total of 1253 fish species.

We linked all the coral-dependent species in the local fish → fish networks to a symbolic “coral” node. Then, we quantified the overall dependency of fish assemblages on corals in each reef locality as the fraction of fish having at least one (unidirectional) path to corals across network links. We opted for this simple and intuitive measure after finding it produced virtually identical results to several, more complex, measures of fish-coral dependency that we explored (such as weighted and unweighted network distance between individual fish species and coral genera, and dependency values estimated using co-extinction simulations50). For each network, we also quantified, separately, the fraction of fish species directly associated with corals (i.e., having a minimum distance to corals in the network of one link) and indirectly associated with corals (i.e. having a minimum distance to corals of more than one link).
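The dependency measure amounts to a reachability search on the local network. The sketch below is one possible way to compute it, using a breadth-first search started from the coral node; the data layout (a mapping from each consumer to the species it feeds on) and all names are assumptions made for illustration.

```python
from collections import deque


def coral_dependency(fish, depends_on, coral_dependent):
    """Return (total, direct, indirect) fractions of fish with a path to coral.

    fish: list of species in the locality.
    depends_on: dict mapping a consumer to the set of fish it feeds on.
    coral_dependent: set of fish linked directly to the coral node.
    """
    # Reverse adjacency: resource -> set of consumers depending on it.
    dependents = {}
    for consumer, resources in depends_on.items():
        for resource in resources:
            dependents.setdefault(resource, set()).add(consumer)

    # Distance (in links) from each fish to the coral node.
    dist = {f: 1 for f in coral_dependent}
    queue = deque(sorted(coral_dependent))
    while queue:
        node = queue.popleft()
        for consumer in dependents.get(node, ()):
            if consumer not in dist:
                dist[consumer] = dist[node] + 1
                queue.append(consumer)

    n = len(fish)
    direct = sum(1 for f in fish if dist.get(f) == 1) / n
    indirect = sum(1 for f in fish if dist.get(f, 0) > 1) / n
    return direct + indirect, direct, indirect


# Hypothetical locality: shark eats grouper, grouper eats goby, goby needs coral.
total, direct, indirect = coral_dependency(
    ["goby", "grouper", "shark", "tuna"],
    {"grouper": {"goby"}, "shark": {"grouper"}, "tuna": set()},
    {"goby"},
)
```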

Risk assessment framework

Following the definitions from the IPCC’s fifth assessment report, we separate vulnerability (combination of sensitivity and adaptive capacity) from exposure to an extrinsic forcing agent (‘hazard’). Then we quantify risk as the combination of vulnerability, exposure, and hazard5.

Assuming, for illustrative purposes, a combined linear effect of local and global hazards on the risk experienced by a target system, we can model the latter (R) as:

$$R=E\times \left(H_{\mathrm{local}}\times V_{\mathrm{local}}+H_{\mathrm{global}}\times V_{\mathrm{global}}\right),$$

(1)

with $E$ being exposure, and $H_{\mathrm{local}}$, $H_{\mathrm{global}}$, $V_{\mathrm{local}}$ and $V_{\mathrm{global}}$ being local and global hazards and their respective vulnerabilities. If we then focus on average per-species risk, and assume no relationship between a system’s remoteness and its intrinsic vulnerability to local and global hazards, we can further simplify the equation by setting $E$, $V_{\mathrm{local}}$ and $V_{\mathrm{global}}$ to 1:

$$R=H_{\mathrm{local}}+H_{\mathrm{global}}$$

(2)

To account for the effect of the expected increase in ecological dependencies with remoteness8 in the illustrative risk assessment model described by Eq. (2), we can add one term to quantify the combined effect of the vulnerabilities emerging from ecological dependencies combined with the exposure to relevant hazards capable of exploiting such vulnerabilities and triggering cascading effects through interaction links (“triggers”):

$$R=\left[\alpha \left(H_{\mathrm{local}}+H_{\mathrm{global}}\right)+\beta \left(\mathrm{ecological}\,\mathrm{dependency}\times \mathrm{triggers}\right)\right]/2$$

(3)

Here, α and β are weights that can be used to modulate the relative importance of the two risk components (impacts from humans and global change vs ecological dependencies). Assuming that both risk components are rescaled in [0,1], to keep R in [0,1], we need to set 0 ≤ α ≤ 2 and β = 2 − α.
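A short check of this constraint, writing $A$ and $B$ (symbols introduced here for illustration only) for the two rescaled risk components, each in [0,1]:

$$0\le R=\frac{\alpha A+(2-\alpha )B}{2}\le \frac{\alpha +(2-\alpha )}{2}=1\quad \text{for}\ 0\le \alpha \le 2$$

Setting α = β = 1 weights the two components equally.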

Applying the risk assessment framework to reef fish communities

We modelled the local risk of a reef fish community (in each 1° × 1° grid cell of the reef reference raster) using two different approaches. First, we quantified the risk as originating from the sum of local and global hazards (Eq. (2)), where local and global hazards refer to the human impact layers19, as described in the “Human impact” section above. Then, we re-assessed risk for each reef fish assemblage when accounting also for the risk component possibly deriving from ecological (fish-coral) dependencies combined with a relevant hazard (e.g., death of coral species due to bleaching) capable of triggering cascading effects across species interaction links by adapting Eq. (3):

$$R=\left[\alpha \left(H_{\mathrm{local}}+H_{\mathrm{global}}\right)+\beta \left(\mathrm{coral}\,\mathrm{dependency}\times \mathrm{coral}\,\mathrm{bleaching}\,\mathrm{susceptibility}\right)\right]/2$$

(4)

Fish-coral dependency and coral bleaching susceptibility were assessed as described in the sections above. To make the different components of risk comparable, prior to computing risk, we rescaled both local + global hazards and fish-coral dependency × coral bleaching susceptibility between 0 and 1 across all reef localities. We did the same for the two sets of risk assessment values obtained using either Eq. (2) or Eq. (4) (to permit direct comparison between the shapes of the risk-remoteness relationships).
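The rescaling and the weighted combination of the two risk components can be sketched as follows. This is a minimal illustration under stated assumptions: min-max rescaling is assumed as the way of mapping values to [0,1], the function names and input values are hypothetical, and the final rescaling of the resulting risk values is omitted.

```python
def rescale(values):
    """Min-max rescale a list of values to the [0, 1] interval."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def risk_with_dependency(h_local, h_global, coral_dep, bleaching, alpha=1.0):
    """Per-locality risk combining hazards and coral-mediated cascades."""
    if not 0.0 <= alpha <= 2.0:
        raise ValueError("alpha must lie in [0, 2]")
    beta = 2.0 - alpha
    # Component 1: summed local and global hazards, rescaled across localities.
    hazards = rescale([l + g for l, g in zip(h_local, h_global)])
    # Component 2: coral dependency x bleaching susceptibility, rescaled.
    cascade = rescale([d * b for d, b in zip(coral_dep, bleaching)])
    return [(alpha * h + beta * c) / 2.0 for h, c in zip(hazards, cascade)]


# Three hypothetical localities, ordered from least to most impacted by humans.
risk = risk_with_dependency([0.0, 1.0, 2.0], [1.0, 1.0, 1.0],
                            [0.8, 0.4, 0.0], [2.0, 2.0, 2.0])
```

With `alpha=2.0` the second component drops out and the ranking of localities reduces to that given by the hazards alone.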

Both equations ideally provide the average risk of a species in a given locality; that is, they assume exposure = 1. They also assume that the average local degree of vulnerability towards either local or global hazards is constant among localities; the respective vulnerability terms can therefore be removed from the risk equations, given that they are constants that would affect each locality equally. See the “Assumptions of the risk assessment equations” section below for additional discussion on these issues.

Assumptions of the risk assessment equations

In this study we demonstrated how the framework of environmental risk assessment could incorporate species dependencies to more thoroughly examine the relationship between risk and remoteness. The proposed risk assessment equations are not intended to provide a definitive global risk assessment of reef fish assemblages. Instead, they serve to assess whether, and to what degree, the risk component stemming from ecological dependencies can affect the expected relationship between risk and remoteness. The exact form of the equations is not overly important. In our equations we assumed constant vulnerability of fish assemblages to local and global hazards. That is, we ignored hazard-specific vulnerabilities. Although fish on coral reefs are likely vulnerable to the various hazards to different extents, modelling this amount of complexity would be extremely difficult. Considering the multiplicity of hazards per locality, and their potentially complex interactions, it would be extremely challenging to obtain precise and realistic values for each of them to test our assumptions. However, we were able to compile several proxies of potential vulnerability to some of the main hazards. In particular, we computed the average vulnerability to fishing for all fish species in each reef locality, using the vulnerability measure provided by FishBase and based on the method by Cheung et al.51. Based on the geographic distributions of the species, we determined the temperature, pH, and organic matter limits for each species, and then we used these data as indicators of each species’ potential tolerance to changes in temperature, acidification and organic pollution. Based on species habitat preference as defined by FishBase, we determined the fraction of demersal, benthopelagic, and coral associated species, as likely more affected by direct human disturbances (such as trampling); and the proportion of pelagic fish as potentially affected by shipping. We then compared those vulnerability proxies with remoteness, finding no strong relationships which would need to be incorporated into the risk equations (Supplementary Fig. 3).

Then, we explored whether our results held when exposure was taken into account (i.e., projecting the average per-species risk to the full fish assemblages). Exposure is a typical parameter involved in environmental risk assessment. For this, we multiplied the risk by the corresponding (natural-log-transformed) fish diversity. The observed patterns (Supplementary Fig. 4) were consistent with those relative to average species risk, which means that our conclusions scale up to fish assemblages. Again, the results of our study do not provide absolute estimates of risk for any of the fish species or coral reefs. However, with further research, we believe such estimates could be realistically obtained given sufficient species-specific data and more information about how the detrimental effects of each hazard are manifested.

Sensitivity analyses

We performed various analyses to check the robustness of our results and conclusions against potential biases stemming from data availability. In particular, we focused on potential relationships between the quality and quantity of information on species ecology and distribution, and remoteness. First, we checked for unequal distribution of sampling effort, under the hypothesis that remote localities could be less investigated than those close to human settlements. A comparison between the number of fish records available from OBIS36 and GBIF37 vs remoteness across all 1° × 1° reef localities revealed that this is not the case, with sampling effort remaining relatively high across all localities regardless of remoteness (Supplementary Fig. 2a, R² = 0.0008).

We then explored whether the availability and quality of the ecological information we used in our analyses decreased with remoteness. For this we evaluated how the TSS values obtained from the comparison between the species ranges derived with our procedure and independent species distribution data from the GASPAR dataset42 varied across reef localities with remoteness. We found no relationship (Supplementary Fig. 2b, R² = 0.0292). We also looked at the individual species TSS values obtained by comparing the distribution of a target species derived by our procedure with that according to the GASPAR dataset. Consistent with the previous result, we found no pattern linking the average of local species’ TSS values to remoteness (Supplementary Fig. 2c, R² = 0.0001). We also explored whether remoteness negatively affected the fraction of species (for which we had distributional data) to be discarded in each locality due to the lack of the ecological information needed in our analyses. Again, the analysis revealed no effect of remoteness on data availability (Supplementary Fig. 2d, R² = 0.0992).

Another potential question arising from our conclusions is whether they would still be valid when species abundances are considered alongside species diversity. To explore this issue, we tested whether the relative abundance of coral-dependent fish changes with remoteness using all the data available from the Reef Life Survey (RLS) dataset52. Finding that coral dependent fish become less abundant as remoteness increases would weaken our results, as the increasing species-level vulnerability stemming from coral dependency would be counterbalanced by the reduction in the overall number of individuals threatened by coral loss. This is not the case. On average, coral associated fish are more abundant than the other species (with an average number of individuals per survey of 782 for coral associated species vs 658 for non associated species). More importantly, the local proportion of associated individuals is unaffected by remoteness (Supplementary Fig. 2e, R² = 0.0002).

Finally, we tested whether our results could be driven or confounded by a potential relationship between functional redundancy and remoteness. We quantified functional redundancy in each locality as one minus the ratio between the number of unique functional entities and total species richness. We identified functional entities using the method and functional diversity datasets as in Mouillot et al.53. We found no relationship between functional redundancy and remoteness (Supplementary Fig. 2f, R² = 0.0042).
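The redundancy index defined above is simple to compute once each species has been assigned to a functional entity. A minimal sketch, with a hypothetical function name and made-up entity labels:

```python
def functional_redundancy(entity_by_species):
    """One minus the ratio of unique functional entities to species richness.

    entity_by_species: dict mapping each local species to its functional entity.
    """
    richness = len(entity_by_species)
    n_entities = len(set(entity_by_species.values()))
    return 1.0 - n_entities / richness


# Four hypothetical species sharing two functional entities.
redundancy = functional_redundancy({"sp1": "FE1", "sp2": "FE1", "sp3": "FE2", "sp4": "FE2"})
```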

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.


Source: Ecology - nature.com
