Novel robust time series analysis for long-term and short-term prediction

The data needed for estimating the SR relationship consist of spawning biomass (S) and recruitment (R) observed over time. A lognormal distribution is frequently used as the distribution of errors for SR relationships¹³. We therefore assume that the residuals from a regression model having (r=log (R)) as a response variable and the logarithm of the latent SR relationship as the mean will have a normal distribution. In addition, we assume that the latent SR relationship is likely to be contaminated by some outliers given that fish populations often suffer from nonnegligible contamination, such as sporadic strong cohorts⁵.

Figure 3

Parameter estimates of the density-independent parameter (a), density-dependent parameter (b), and autocorrelation ((rho)) for the simulation using the HS SR function with autocorrelation (true (rho = 0.8)) in the residuals.

Full size image

Figure 4

Application of the robust SR model to fish population data from Japan. (Top) Estimates of ((b-min (S))/(max (S)-min (S))) using the LS and RSR methods. (Bottom) Examples of fitted SR curves using the LS (black line) and RSR (red line) methods (left, walleye pollock in the Sea of Japan; right, round herring in the Tsushima warm current).

Full size image

A robust regression approach

Suppose that the logarithm of recruitment ((r_t = log (R_t), (t = 1, ldots , T))) has the following autocorrelated normal distribution,

$$begin{aligned} r_t = f(S_t|{varvec{theta }})+varepsilon _t, end{aligned}$$

(1)

where (varepsilon _t) is a scaled autoregressive error of order one, that is, (sqrt{lambda _t}(varepsilon _t-rho sqrt{lambda _{t-1}} varepsilon _{t-1})= e_t) with a gaussian noise (e_t) of mean zero and variance (sigma ^2), (S_t) is the spawning biomass, (f(S_t|{varvec{theta }})) is the logarithm of a density-dependent population growth model (spawner-recruitment (SR) curve), ({varvec{theta }}) is the parameter (vector) of the SR curve, (rho) is the autocorrelation, and (sigma ^2) is the base variance of the normal distribution. (lambda _t , (in (0,1])) is the weight for a datum in year t. Rearranging the equation for (varepsilon _t), we have (varepsilon _t sim N(rho sqrt{lambda _{t-1}} varepsilon _{t-1}, sigma ^2/lambda _t)) (Appendix A). We define (lambda _t) to be related to the magnitude of the residual (varepsilon _t),

$$begin{aligned} lambda _t = exp left( – phi varepsilon _t^2 right) , end{aligned}$$

where (phi , (>0)) is the parameter that adjusts the influence of outliers. Given that the base variance (sigma ^2) is divided by (lambda _t), the variance is inflated when the difference between the datum and the SR curve is large. The model is equivalent to the AR(1) model when (lambda _t equiv 1) (i.e., (phi =0)) for any t. (sqrt{lambda _t}) is interpreted as the probability of the datum being generated from an uncontaminated normal distribution. When changing the (phi) parameter with (rho =0), the shapes of the probability density function and its derivative are similar to the Tukey’s biweight (also called bisquare) function¹⁴, which is close to the gaussian function near zero but decays swiftly as the datum becomes farther from zero (Fig. 1).

By solving the equation at equilibrium, the mean deviance residual at (t=1) is zero and the variance at (t=1) is given by ({text{var}}({varepsilon_{1}} ){ = }{sigma ^{2}} {{/}}left[ {lambda _{1}} left( {1} – {rho ^{2}} {tilde{lambda }} right) right]), where ({tilde{lambda }}) is calculated by substituting the sample mean of (lambda _t), (tilde{lambda } = (1/T) sum _{t=1}^T lambda _t) (Appendix B). Incorporating the initial status, the log-likelihood function to be maximized is given by

$$begin{aligned} log (L) = sum _{t=1}^T log left( N(r_t|f(S_t|{varvec{theta }})+delta _t, nu _t sigma ^2 lambda _t^{-1}) right) , end{aligned}$$

(2)

where (delta _{t} = 0) and (nu _{t} = (1-rho ^2 tilde{lambda })^{-1}) if (t = 1), and (delta _{t} = rho sqrt{lambda _{t-1}} varepsilon _{t-1}) and (nu _{t} = 1) if (t > 1). Because (varepsilon _{t-1}) increases and (lambda _{t-1}) decreases when there is an outlier at (t-1), the multiplication of (rho) and (sqrt{lambda _{t-1}}) mitigates the influence of an extreme outlier on autocorrelation and contributes to the restoration of the original autocorrelation.

We need to estimate the parameters (sigma), (rho), and (phi) in addition to the SR relationship parameters ({varvec{theta }}). The parameter (phi) determines the mixing proportion of contamination and governs the predictive ability of the model. We use time series cross-validation¹⁵, which is also called retrospective forecasting¹⁶ (RF), to stably determine the value of (phi). First we delete the last datum. Then we use the SR relationship estimated from the data excluding the last datum to forecast recruitment and calculate its error assuming that the deleted recruitment for the last year is true. Next, we delete the two last data, forecast the second-to-last recruitment, and calculate the error assuming that the deleted second-to-last year’s recruitment is true. After the procedure is repeated on a rolling basis, the (phi) parameter having the smallest average error is finally selected. The optimum (phi) is determined by minimizing the following RF error:

$$begin{aligned} RF_R = exp left( frac{1}{P} sum _{t=1}^P log left[ left( r_{T-(t-1)} -hat{r}_{T-(t-1)}^{1:(T-t)} right) ^2 right] right) . end{aligned}$$

(3)

This is the geometric mean of predicted errors, which stabilizes the performance of retrospective forecasting. (r_{T-(t-1)}) is the logarithm of observed recruitment in year (T-(t-1)) and (hat{r}_{T-(t-1)}^{1:(T-t)}) is the predicted value estimated using the data from years 1 to (T-t), which is given by

$$begin{aligned} hat{r}_{T-(t-1)}^{1:(T-t)} = f(S_{T-(t-1)}|hat{varvec{theta }})+hat{rho } sqrt{hat{lambda }_{T-t}} hat{varepsilon }_{T-t}, end{aligned}$$

where (t = 1, ldots , P). We adopt (P=10) for stable estimation in this paper, though we commonly take 5 as the minimum P¹⁷.

All subsequent analyses are performed using R¹⁸ and its package TMB¹⁹ (Template Model Builder).

Simulation

We generate the simulated data ((left{ (R_t, S_t) ; t = 1, ldots , T right})) with some outliers and autocorrelated errors and test the performance of our robust SR (RSR) method in comparison with the LS and LAD methods. LAD was chosen because it is a typical robust method and is generally superior to the least median squares method used in Chen & Paloheimo (1995)¹¹. The average recruitment data are generated from the Hockey–Stick (HS) SR function¹², (f(S_t|{varvec{theta }}) = log left( a min (S_t, b) right)), where ({varvec{theta }} = (a, b) = (1.2, 500)). Stochastic normal errors are added to the log recruitment data with or without autocorrelation. When there is an autocorrelation in the residuals of log recruitment, the autocorrelation is set to (rho = 0.8). To examine the effect of outliers, we add the outliers that occur at the expected frequency of twice per 10 years ((p=0.2)) to the residuals of log recruitment. The patterns of outlier occurrence are threefold: evenly occurring positive and negative outliers ((q=0.5)), all positive outliers ((q=1.0)), and all negative outliers ((q=0.0)) (see Appendix C for the definition of q). We then have eight types of simulated data (no outliers, positive and negative outliers, all positive outliers, and all negative outliers for autocorrelation in the normal residual (rho = 0) and (rho =0.8), respectively). The simulations are replicated 1,000 times for each of the eight types. The length of each SR data time series (T) is set to 30 years which is typical for SR time series data^9,12. The performance of the methods is evaluated by two indicators that represent long-term and short-term predictive abilities ((hat{R}_0 – R_0)/R_0) and ((hat{R}_{T+1} – R_{T+1})/R_{T+1}), respectively, where the former is the asymptotic maximum recruitment ((R_0 = ab) for the HS SR function) and the latter is recruitment in the ensuing year (T+1), which is given by (R_{T+1} = exp (f(S_{T+1}|{varvec{theta }}) + rho omega _{T} + eta _{T+1})), where (omega _T) and (eta _{T+1}) are independent gaussian noises (Appendix C). Note that the true recruitment at (T+1) does not include any outliers. The mathematical details of the simulation are given in Appendix C. Autocorrelation is always estimated such that (rho) is set to zero when an estimate of (rho) is equal to or less than zero because a negative autocorrelation is usually impractical²⁰. The parameter (log (phi )) in RSR is chosen from the grid values from (-3.0) to 3.0 in increments of 0.5. The best (phi) is a minimizer of the RF error (RF_R) (Eq. 3).

For sensitivity tests, we conduct the following additional simulations: (S1) same as the above base case scenario (S0) except that (a = 1.8); (S2) same as S0 except that (p = 0.1) (the expected frequency of outliers is once every 10 years) in place of (p=0.2); (S3) same as S0 except that (p = 0.3) (the expected frequency of outliers is three times every 10 years) in place of (p=0.2); (S4) same as S0 except that (f(S_t|{varvec{theta }})) is the logarithm of the Beverton–Holt function; (S5) same as S0 except that (f(S_t|{varvec{theta }})) is the logarithm of the Ricker function; S6) same as S0 except for the spawner-abundance dependent p, in which the expected frequency of outliers is higher for lower spawner abundances than for higher spawner abundances.

Finally, we calculate biological reference points related to maximum sustainable yield (MSY), i.e., fishing rate at MSY ((F_{rm {msy}})) and spawning biomass at MSY ((S_{rm {msy}})), for each scenario and evaluate their relative biases. To calculate (F_{rm {msy}}) and (S_{rm {msy}}), we require additional information on survival and growth as well as an assumption about population dynamics. For simplicity, we use the delay-difference model as the population dynamics model⁵. The mathematical details are given in Appendix D.

Real data analysis

Ichinokawa, Okamura & Kurota (2017) fitted the SR curves to fish population data from Japan which comprise 26 SR datasets (Appendix E), demonstrating that some populations showed strong density dependence but others had weak or low density dependence. We fit the HS SR curves to the same 26 SR datasets used in Ichinokawa, Okamura & Kurota (2017). Because Ichinokawa, Okamura & Kurota (2017) used LS as the fitting method, we use LS and RSR to compare the density-independent parameter (log (hat{a})), standardized density-dependent parameter (( hat{b}-min (S) )/( max (S) – min (S) )), autocorrelation in the residuals (hat{rho }), and predictability (hat{RF}_R) in the HS SR curves.

Source: Ecology - nature.com

Novel robust time series analysis for long-term and short-term prediction

A robust regression approach

Simulation

Real data analysis

Aquatic reservoir of Vibrio cholerae in an African Great Lake assessed by large scale plankton sampling and ultrasensitive molecular methods

Recent CO2 levels promote increased production of the toxin parthenin in an invasive Parthenium hysterophorus biotype

ITALIAN LANGUAGE

ENGLISH LANGUAGE