Effects of climate and land-use changes on fish catches across lakes at a global scale

Choice of modeling approach

We chose to use a Bayesian networks modeling approach because it could minimize negative effects of data limitations on the statistical power of our analysis, by accounting for our understanding of multi-level, causal processes that could underlie changes in CATCHs in response to changes in climate and land use across lakes. A BNM could be developed to represent the hypothesized causal processes by expressing them as conditional probabilistic relationships between model variables, with which the estimation of every BNM coefficient took all model inputs derived from data into account²⁰. For example, if we developed a simple BNM to represent that changes in AT resulted in changes in WT, which, in turn, resulted in changes in CATCH in a lake, the estimation of model coefficients for the relationship between AT and WT and for the relationship between WT and CATCH would both take model inputs for all variables (AT, WT, and CATCH) into account. Although the 31 study lakes were selected partly because of good data availability, there were still large within-lake differences in data availability among model variables (as shown in Supplementary Data 3). For example, we obtained 9, 16, and at least 32 years of data for CHL, WT, and all the other variables in Lake Chilwa, respectively. Therefore, the BNM’s capability of taking all model inputs into account for the estimation of every model coefficient was important to maximize the statistical power of our analysis.

Model development

To mathematically formulate our BNM based on our hypothesized causal processes (Fig. 2), we assumed that (1) each response variable in the BNM has a normal distribution, with a mean that can be expressed as a function of predictor variables based on the primary literature described below, and (2) annual time-series data are independent observations, such that the mean value of each response variable in 1 year is dependent upon the values of predictor variables in the same year, but independent from all of its mean values in the other years. For example, the conditional probability distribution of WT in our BNM was expressed as

$$left{ begin{array}{l}{mathrm{WT}} _i ,{simmathrm{NORM}}left( {mu _i^{{mathrm{WT}}},sigma ^{{mathrm{WT}}}} right), mu _i^{mathrm{WT}} = beta _{0,i}^{{mathrm{WT}}} + beta _{1,i}^{{mathrm{WT}}} times {mathrm{AT}}_i{mathrm{,}}end{array} right.$$

(1)

where NORM(μ, σ) represents a normal distribution with a mean of μ and a standard deviation of σ, subscripted i indicates the ith lake, superscripted WT indicates that the parameter or BNM coefficient was associated with the response variable WT, and β represents a BNM coefficient. In Eq. (1), μ^WT was expressed as a linear function of AT based on the primary literature²¹ and was assumed to be dependent upon the observed value of AT in the same year, but independent from any μ^WT of the other years. In a similar manner, we give details for how we developed the conditional probability distributions of the other three BNM response variables—change in water level (ΔWL), CHL, and CATCH—in the following of this section.

The conditional probability distribution of ΔWL in our BNM was developed based on an empirical linear relationship between ΔWL and annual catchment runoff (RUNOFF)²³, of which the latter could be expressed as a balance between PRE and evapotranspiration (ET)⁵³. In a simple manner, these relationships could be expressed as

$$Delta {mathrm{WL}} = {it{alpha }}_0^{Delta {mathrm{WL}}} + alpha _1^{Delta {mathrm{WL}}} times {mathrm{RUNOFF}},$$

(2)

$${mathrm{RUNOFF}} = {mathrm{PRE}} – {it{alpha }}_1^{{mathrm{ET}}} times {mathrm{PE}} times {mathrm{LUag}} – {it{alpha }}_2^{{mathrm{ET}}} times {mathrm{PE}} times {mathrm{LUun}},$$

(3)

where α represents an empirical coefficient, PE is potential evaporation, LUun is undeveloped land use, and α₁^ET × PE × LUag and α₂^ET × PE × LUun represent ET over agricultural area and undeveloped area, respectively. In this study, we categorized catchment land use into three types: agricultural (which includes cropland, managed pasture, and rangeland), undeveloped (which includes forest, primary and secondary non-forest land, the lake itself, and other water and ice area), and urban. Equation (3) did not include urban land use because urban water use is mostly non-consumptive and returns to the catchment⁵⁴. By combining Eqs. (2) and (3) and re-arranging α coefficients, the conditional probability distribution of ΔWL in our BNM was expressed as

$$left{ begin{array}{l}Delta {mathrm{WL}}_i , {simmathrm{NORM}}left( {mu _i^{Delta {mathrm{WL}}},sigma ^{Delta {mathrm{WL}}}} right),hfill mu _i^{Delta {mathrm{WL}}} = beta _{0,i}^{Delta {mathrm{WL}}} + beta _{1,i}^{Delta {mathrm{WL}}} times {mathrm{PRE}}_i + beta _{2,i}^{Delta {mathrm{WL}}} times {mathrm{PE}}_i + beta _{3,i}^{Delta {mathrm{WL}}} times {mathrm{PE}}_i times {mathrm{LUag}}_i{mathrm{,}}end{array} right.$$

(4)

where each β coefficient can be expressed as a combination of α coefficients in Eqs. (2) and (3). We only included LUag in Eq. (4) because in each of these 31 study lakes, LUag + LUun in the catchment was almost a constant, which means that LUun could be replaced with a constant minus LUag, as the increase in agricultural area mostly resulted from the decrease in undeveloped area, or vice versa, in the study period.

The conditional probability distribution of CHL in our BNM was developed based on empirical log–log relationships between CHL and total phosphorus²⁴ and between CHL and WT²⁵. Because total phosphorus data were only available for a few of our study lakes, we modeled the total phosphorus in lakes as a function of PRE²⁶, LUag²⁶, and ΔWL²⁹. This led to a conditional probabilistic distribution for CHL expressed as

$$left{ begin{array}{l}log ({mathrm{CHL}}_i),{simmathrm{NORM}}left( {mu _i^{{mathrm{CHL}}},sigma ^{{mathrm{CHL}}}} right),hfill mu _i^{{mathrm{CHL}}} = beta _{0,i}^{{mathrm{CHL}}} + beta _{1,i}^{{mathrm{CHL}}} times log ({mathrm{PRE}}_i) + beta _{2,i}^{{mathrm{CHL}}} times log ({mathrm{LUag}}_i) + beta _{3,i}^{{mathrm{CHL}}} times mu _i^{Delta {mathrm{WL}}} + beta _{4,i}^{{mathrm{CHL}}} times mu _i^{{mathrm{WT}}},end{array} right.$$

(5)

where μ^WT and μ^ΔWL are from Eqs. (1) and (4), respectively.

Finally, the conditional probability distribution of CATCH was developed based on (1) a theoretical relationship between CATCH and fish biomass (B)³⁴, (2) a hypothesized empirical relationship between fishing catchability (Q) and EFF and B³⁴, and (3) a hypothesized empirical relationship between B and WT^27,28, ΔWL^29,30, and CHL^31,32 and number of fish stocked (ST)^35,36. In a simple manner, these relationships could be expressed as

$${mathrm{CATCH}} = Q times {mathrm{EFF}} times B Rightarrow log ({mathrm{CATCH}}) = log (Q) + log ({mathrm{EFF}}) + log (B),$$

(6)

$$log (Q) = {it{alpha }}_0^Q + {it{alpha }}_1^Q times log ({mathrm{EFF}}) + {it{alpha }}_2^Q times log (B),$$

(7)

$$log (B) = alpha _0^B + {it{alpha }}_1^B times {mathrm{WT}} + {it{alpha }}_2^B times Delta {mathrm{WL}} + {it{alpha }}_3^B times log ({mathrm{CHL}}) + {it{alpha }}_4^B times log ({mathrm{ST}}),$$

(8)

By combining Eqs. (6)–(8) and re-arranging α coefficients, we developed the conditional probabilistic distribution for CATCH expressed as

$$left{begin{array}{l}log ({mathrm{CATCH}}_i),{simmathrm{NORM}}left( {mu _i^{{mathrm{CATCH}}},sigma ^{{mathrm{CATCH}}}} right), mu _i^{{mathrm{CATCH}}} = beta _{0,i}^{{mathrm{CATCH}}} + beta _{1,i}^{{mathrm{CATCH}}} times mu _i^{{mathrm{WT}}} + beta _{2,i}^{{mathrm{CATCH}}} times mu _i^{Delta {mathrm{WL}}} + beta _{3,i}^{{mathrm{CATCH}}} times mu _i^{{mathrm{CHL}}} +, beta _{4,i}^{{mathrm{CATCH}}} times log ({mathrm{ST}}_i) + beta _{5,i}^{{mathrm{CATCH}}} times log ({mathrm{EFF}}_i),end{array} right.$$

(9)

where μ^WT, μ^ΔWL, and μ^CHL are from Eqs. (1), (4), and (5), respectively. As described in the Supplementary Methods, we did not estimate β_4,i^CATCH for 23 lakes where stocking effects on CATCHs were not sufficiently important in the study period.

The derivations and data sources of model inputs, including climate and land-use drivers (AT, PRE, and LUag), lake-environmental factors (WT, WL, and CHL), CATCH, ST, and EFF are given in the Supplementary Methods.

Prior distributions

We followed a previously used procedure²⁰ to derive prior distributions of BNM coefficients. For example, to derive the prior distribution of (beta _{i}^{{mathrm{WT}}}), we first expressed the function of (mu _{i}^{{mathrm{WT}}}) in Eq. (1) as 31 linear models (i.e., one for each study lake) in a form

$${mathrm{WT}}_i = beta _{0,i}^{{mathrm{WT}}} + beta _{1,i}^{{mathrm{WT}}} times {mathrm{AT}}_i + varepsilon _i^{{mathrm{WT}}},$$

(10)

where ε is the residual error. Then we derive the least-squares estimates for the coefficients of each of these 31 linear models based on model inputs derived from data for WT_i and AT_i in the same year. In theory, the least-squares estimates for the coefficients of each of these 31 linear models have a multivariate normal (MVN) distribution because they could be expressed as linear combinations of WT_i, which was assumed to have a normal distribution as shown in Eq. (1). Finally, we used the MVN distribution for the least-squares estimates of the linear model coefficients as the prior distribution of each of the 31 sets of β_i^WT, but the covariance matrix of the MVN distribution was set to be proportional to its un-scaled, least-squares estimates, with a proportional constant (i.e., the hyperparameter) having a noninformative (hyperprior) distribution²⁰. We repeated this procedure to derive prior distributions for β_i^ΔWL, β_i^CHL, and β_i^CATCH. The prior distributions of these BNM coefficients were always developed on a lake-by-lake basis because for each of the 31 study lakes, we were able to obtain data to derive model inputs for at least 9 years for all BNM variables (refer to Supplementary Data 3 for data sources and availability). Lastly, for nuisance parameters σ^WT, σ^ΔWL, σ^CHL, and σ^CATCH in Eqs. (1), (4), (5), and (9), respectively, we set each of them to have a noninformative prior distribution.

Coefficient estimation

The posterior distributions of our BNM coefficients were developed based on pooled information across the 31 study lakes. Our BNM was structured in a way to have the transferability of information across the lakes informed by data. Specifically, in the derivation of posterior distributions of our BNM coefficients, the relative importance of lake-specific information increases with lake-specific sample size and across-lake difference. If a lake was very different from all the other lakes, for example, the relative importance of its lake-specific information would be much higher than across-lake information in the derivation of posterior distributions of lake-specific BNM coefficients.

Computationally, we derived empirical posterior distributions for all BNM coefficients by using a Gibbs sampling method to run Markov chain Monte Carlo (MCMC) simulations⁵⁵. We ran three random-starting MCMC chains by implementing the program JAGS⁵⁶ and the package runjags⁵⁷ in R⁵⁸. After discarding 20,000 iterations for burn-in and adaptation, we considered that the convergence of three MCMC chains was achieved at 50,000 interactions as autocorrelations were close to 0 and Gelman and Rubin’s convergence diagnostics⁵⁹ were <1.005 for all coefficients. Our empirical posterior distribution of each BNM coefficient was based on 50,000 samples from each converged MCMC chain.

Effects between variable pairs

As shown in Fig. 2, there were 13 pairs of predictor and response variables in our BNM. For each of these 13 variable pairs, we categorized the effects of predictor variable on response variable as positive, negative, and mixed in each lake based on the posterior distribution of the associated BNM coefficient, which we used as a proxy of effect strength because it represents how much the response variable would change in response to one unit change in the predictor variable. For example, we categorized the effects of AT on WT in lake i based on the posterior distribution of β_1,i^WT, which represents how much of WT would increase in response to one unit change of AT. We categorized effects between a variable pair as positive and negative based on the 75% CI of the associated BNM coefficient. We categorized effects between a variable pair as mixed when both positive and negative 75% CIs of the associated BNM coefficients included 0, which, statistically, also means that its 50% CI included 0.

Changes in drivers associated with a 25% catch decrease

We implemented the BNM and used a Monte Carlo simulation method to empirically derive distributions of climate (AT and PRE) and land-use (LUag) drivers associated with a 25% decrease in CATCH from its 1970–2014 median (i.e., a 25% catch decrease) in each of the 31 study lakes. We used a 25% catch decrease as our simulation target because it is close to the maximum value of standard errors for the ratios of CATCH to its median, which ranged from 2.1% to 25.7% across the study lakes in the study period.

Monte Carlo simulation

Our Monte Carlo simulation was a three-step process. The first step was to randomly sample a set of BNM coefficients (i.e., β^WT, β^ΔWL, β^CHL, and β^CATCH) from empirical posterior distributions by implementing the package rv⁶⁰ in R. Then, we calculated predicted medians of WT, ΔWL, CHL, and CATCH (denoted as μ₀^WT, μ₀^ΔWL, μ₀^CHL, and μ₀^CATCH, respectively) using the set of BNM coefficients and the 1970–2014 medians of all BNM inputs.

The second step was to generate a set of BNM inputs from probability distributions based on lake-by-lake data by implementing random number generators in R. We assumed that each of AT, PRE, and LUag has a normal distribution. PE was derived empirically based on randomly sampled AT and the assumption that temperature change is uniform throughout the year (e.g., if randomly sampled AT was 1 °C higher than its 1970–2014 median, we assumed that AT in every month was 1 °C higher than its 1970–2014 median).

The third step was to calculate predicted values (i.e., μ^WT, μ^ΔWL, μ^CHL, and μ^CATCH) using randomly sampled BNM coefficients, randomly sampled values of one climate or land-use drivers, and 1970–2014 medians of the other climate and/or land-use variables not being evaluated. For example, to investigate effects of AT on CATCH, we calculated BNM predicted values using BNM coefficients randomly sampled from empirical posterior distributions, inputs of AT randomly sampled from normal distributions, inputs of PE derived based on random samples of AT, and 1970–2014 medians of PRE and LUag.

We kept the set of BNM coefficients, inputs, and predicted values that led to a ratio of predicted CATCH to predicted median CATCH, which was calculated as [exp(μ^CATCH − μ₀^CATCH)], between 0.74 and 0.76, and assumed they were associated with a 25% catch decrease. We repeated the Monte Carlo simulation process until we obtained 10,000 samples associated with a 25% catch decrease for each lake to derive empirical probability distributions.

Assessment of climate and land-use effects

We assessed how climate and land-use drivers (i.e., AT, PRE, and LUag) could lead to a 25% catch decrease by analyzing empirical probability distributions derived from samples obtained in our Monte Carlo simulations.

To assess effects of AT on CATCH in each of the 31 study lakes, we analyzed empirical probability distributions associated with a 25% catch decrease for changes in AT, predicted effects of AT on WT, WL, and CHL, and predicted effects of WT, WL, and CHL on CATCH. We used predicted changes in WT, WL, and CHL driven by AT to quantify effects of AT on WT, WL, and CHL, respectively. Following the methods used by the Intergovernmental Panel on Climate Change (IPCC)⁴⁷, we calculated predicted changes as an actual difference from the 1970–2014 medians for AT, WT, and WL and as a percent difference from the 1970–2014 median for CHL. We used predicted changes in CATCH driven by WT, WL, and CHL to quantify effects of WT, WL, and CHL on CATCH, respectively. The predicted changes in CATCH driven by WT, WL, and CHL were calculated as percent differences from the 1970–2014 medians based on Eq. (9). For example, we calculated predicted effects of WT on CATCH as

$$left{ {exp left[ {beta _1^{{mathrm{CATCH}}} times left( {mu ^{{mathrm{WT}}} – mu _0^{{mathrm{WT}}}} right)} right] – 1} right} times 100%.$$

(11)

Similar to how we summarized BNM coefficients, we also categorized relationships between AT and CATCH and between AT and each of WT, WL, and CHL as positive, negative, and mixed based on 75% CIs of their changes associated with a 25% catch decrease in each lake. Further, we considered effects of WT, WL, or CHL on CATCH to be important in each lake if its predicted effects could lead to a >9.1% catch decrease based on 75% CI. The cutoff value 9.1% was calculated as (1–0.75^1/3) × 100%, based on the assumption that WT, WL, and CHL contributed equally to a 25% catch decrease.

With two changes, we followed the same method used to assess AT effects on CATCH to assess effects of PRE and effects of LUag on CATCH. First, following the IPCC methods for calculating projected global distributions of changes in PRE and land use, we calculated predicted changes as a percent difference from the 1970–2014 median for PRE and as an actual difference from the 1970–2014 median for LUag. Second, we did not analyze changes in WT and effects of WT on CATCH in both assessments, because there was no hypothesized linkage between PRE and WT and between LUag and WT (Fig. 2). Consequently, since effects of WT on CATCH were not assessed, we considered effects of WL or CHL on CATCH to be important in each lake if its predicted effects could lead to a >13.4% catch decrease based on 75% CI. The cutoff value 13.4% was calculated as (1–0.75^1/2) × 100%, based on the assumption that WL and CHL contributed equally to a 25% catch decrease.

Characteristics of lakes where CATCHs are vulnerable

We define vulnerability of a lake as the extent to which the focal unit (e.g., CATCH) will be adversely affected by environmental changes⁶¹. In this study, we used the magnitude of change for each of AT, PRE, and LUag associated with a 25% catch decrease to index the vulnerability of a lake to a 25% catch decrease owing to climate and land-use changes. For example, we considered a hypothetical lake A to be more vulnerable than another hypothetical lake B if the magnitude of change for each of AT, WL, and CHL associated with a 25% catch decrease was smaller in lake A than lake B.

We investigated whether the vulnerability of a lake to a 25% catch decrease corresponded to any of the two socio-economic characteristics associated with the catchment: access to clean water and shoreline population density; and any of the two hydrogeomorphological characteristics of the lake: average depth and SDI. The access to clean water was indicated by the proportion of population using drinking-water and sanitation services in the catchment³⁷. The shoreline population density was the population density within 10 km of a lake’s shoreline. SDI is a measure of circularity of the lake surface²⁴, which is defined as

$${mathrm{SDI}} = 0.5 times {mathrm{SL}}/left( {pi times {mathrm{Alake}}} right)^{0.5},$$

(12)

where SL is shoreline length and Alake is lake area. A larger SDI indicates that the lake surface is less circular and may have a larger littoral area relative to lake area. The derivations and values of these socio-economic and hydrogeomorphological characteristics are given in the Supplementary Methods.

We conducted a correlation analysis to evaluate whether the relationship between each of the 4 socio-economic and hydrogeomorphological characteristics and the magnitude of change for each of AT, PRE, and LUag associated with a 25% catch decrease was significant at p = 0.05 level (t test, N = 31). To quantify the magnitude of change for each of AT, WL, and CHL associated with a 25% catch decrease, we used the predicted median of absolute change associated with a 25% catch decrease, which was calculated from empirical probability distributions derived from samples obtained in our Monte Carlo simulations.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Source: Ecology - nature.com