Recent declines in salmon body size impact ecosystems and fisheries
Age-length (AL) datasets
Alaska Department of Fish & Game (ADF&G) monitors the number, body size, sex, and age of Alaska salmon harvested in a variety of fisheries and on their return breeding migration from the ocean to freshwater. Age and body length (AL) data have been collected on mature adults from commercial, subsistence, and sport harvests, escapement (spawning population) projects, and test fisheries since the early 1900’s. ADF&G data has historically been archived in regional offices; however, for this project we were able to compile all available data from across the state (Supplementary Figs. S7–S10) into a single dataset, representing over 14 million raw AL samples.
The majority of Alaska salmon fisheries target mature adults during their breeding migration into freshwater. Data from commercial harvests represent the largest proportion (57%) of measurements and are generally collected from marine waters and near river mouths. Although many Alaska salmon fishing districts are designed to operate as terminal fisheries, targeting fish destined for their river of origin, even terminal fisheries can intercept salmon returning to other Alaskan populations, and many other districts are non-terminal. Because most commercial salmon fisheries in Alaska catch a combination of fish from the target stock and intercepted fish returning to other populations, commercial samples often include a mix of fish from different populations within a river drainage and outside the drainage (e.g., Southeast Alaska troll fishery may be >80% non-local fish at times). Commercial samples from some fisheries targeting wild salmon could include a relatively low but unknown proportion of hatchery-origin salmon, which could not be excluded from our analyses without individual-level information on origin (hatchery or wild). Samples from escapement enumeration projects (sampling projects that count the number of mature adults that ‘escape’ the fishery and return to freshwater) make up the next highest proportion of AL measurements (33%). Escapement projects collect AL data from fish sampled in the freshwater environment, close to or on the spawning grounds, generally at counting towers, weirs, or fences. A variety of other sampling project types (test fishing, subsistence catch, sport catch) make up the remaining portion of these data, with no single project type representing more than 5% of the samples. ADF&G recorded the name of the sampling project, generally as the name of a given river (e.g., Fish Creek) or district (e.g., Togiak District), which we refer to as sampling locations. To ensure as much as possible that methods of data collection were consistent across locations and species, we excluded data collected from projects other than commercial harvest and escapement monitoring from statistical analyses.
Age and length (AL) measurements were collected by ADF&G personnel using standard methods56. Briefly, fish length is collected to the nearest millimeter using a measuring tape or a manual or electronic measuring board, depending on project and year. Fish age was most commonly estimated by ADF&G scientists reading growth annuli on scales57. For many AL measurements, specimen sex was also recorded, predominantly using external characteristics for sex determination. Sex determination with external characteristics in ocean-phase fish is frequently unreliable58. Because most of our data come from commercial harvests that occur in ocean-phase fish prior to the development of obvious external secondary sexual characteristics, we did not analyze the sexes separately. However, other studies examining length at age with reliable sex determination have shown similar trends in size and age for males and females33,59. As in Lewis et al.19, we assume our results reflect similar trends in male and female salmon.
To ensure data were of high quality, a number of quality assurance checks were established, and data failing those checks were excluded from analysis. These checks include ensuring that ages and lengths were within reasonable bounds for each species, that sample dates were reasonable, that data were not duplicated, and that data were all of the same length measurement type (mid-eye to fork of tail). Because mid-eye to fork length was by far the most commonly used length measurement type (85% of samples) within the data, and the vast majority of sample protocols use mid-eye to fork measurements, we assumed that observations where no length measurement type was reported (0.08% of samples) were mid-eye to fork. No other unique length measurement type accounts for more than 2% of samples. We also excluded any samples that measured fewer than ten fish for a given year/location combination. After these extensive checks, we were left with measurements on over 12.5 million individual salmon.
A wide variety of gear types were used to collect samples. The three most common gear types included gillnet, seine, and weir. Sampling methods within projects did not change systematically over time; however, for at least some projects, changes did occur, such as changes in gillnet mesh materials and sizes (for commercial harvest60) or sampling location within a watershed (for escapement projects). Some of these methodology changes are sporadically reflected in the data (e.g., mesh size), whereas others are not included and difficult to capture (e.g., weir location changes). Given the inconsistency in data and metadata associated with these fine-scale methodology changes, and the spatial and temporal scale of this dataset, changes in mesh size, gear type, or fine scale location changes (movement of a project within the same river system) were not included in our analyses.
Consistency in salmon size declines
To quantify the spatial and temporal extent of body size change, we estimated the average length of fish for each species in each sampling location and return year (the year when the fish was caught or sampled on its return migration to freshwater), which we interpret as putative biological populations (henceforth referred to as populations). For each population, we averaged these annual means to find the mean body length during a baseline period before 1990 and recent period after 2010. The pre-1990 period included all data collected before 1990, though relatively little data was available before 1980. Comparing data from two discrete time periods avoids potential edge effects that would be introduced in dividing a consecutive time series. Only populations for which we had data in both periods were included (100 sockeye, 34 Chinook, 32 chum, and 13 coho salmon populations). We established a criterion of at least 3 years of data for each population during each time period for inclusion in this analysis. Although somewhat arbitrary, we chose 1990 as the end of the early period to ensure a large number of populations had sufficient data to be included, while still being early enough to provide a meaningful baseline for comparison with current data. Because our goal was to investigate trends experienced by resource users in Alaska, we included data from some stocks that are known to capture salmon that originated from areas other than Alaska. For example, estimates for Chinook salmon from Southeast Alaska are likely influenced by the inclusion of troll-caught Chinook salmon, which are largely composed of salmon originating from British Columbia (B.C.) and the U.S. West Coast. For visualization, the results of this analysis were then scaled up to the level of the fisheries management areas established by ADF&G (Fig. 1).
To quantify and visualize continuous changes in body size across time, we fit general additive models (GAMs) to annual mean population body length for each species. To avoid convergence problems due to small sample sizes, data collected before 1975 were excluded from this analysis. In contrast to previous studies that assumed monotonic linear changes in size18,19, year was included as a nonlinear smoothed term because preliminary analyses suggested that the rate of length change varied through time. We included data from all populations for which observations from five or more years were available (276 sockeye salmon populations, 202 Chinook salmon populations, 183 chum salmon populations, 142 coho salmon populations). We knew a priori that salmon populations differ in average body size, so to preserve original units (mm) while controlling for variation in absolute body length among populations, we included two fixed factors: population and region. We assigned regions based on terrestrial biomes and the drainage areas of major watershed (shown numbered on Fig. 1, colored by ADF&G management region). Repeating these GAMs on escapement data alone provided equivalent results (Supplementary Fig. S11), which confirms that our results are not due to an artifact of sampling procedures through time.
To visualize changes in age structure and size-at-age, we fit very similar GAMs to age and length-at-age data. As above we included fixed effects for population and region, as well as a nonlinear year effect. Using the same dataset as the previously described GAMs, we used either mean freshwater age, mean saltwater age, or mean length-at-age as the response variable. For length-at-age, we separately fit GAMs for the four most common age classes in each species, except coho salmon, for which sufficient data was available for only three age classes.
To determine the extent to which patterns of body size change are consistent across space within a species, we re-fit these GAMs by replacing the main year effect by either a region-by-year or population-by-year interaction and compared model fit using AIC. These nonlinear interactions allow regions or populations to differ in their patterns of length change through time. These models are more data intensive than the previous GAMs, so we included data from all populations for which our time series consisted of any 20 or more years of data (123 sockeye salmon populations, 37 Chinook salmon populations, 38 chum salmon populations, 14 coho salmon populations).
Contributions of declining age versus growth
To partition the contribution of changes in population age structure versus size-at-age to changes in mean population length, we used the chain rule61. We used the discrete time analog of the chain rule
$${Delta}left( {xy} right) = y{Delta}x + x{Delta}y,$$
(1)
and assume that change in mean length is a function of changes in population age structure, p(a), and mean length-at-age, x(a). For each species and population, age structure in year t was calculated as the proportion of individuals in each age a. Mean length in year t is given by
$$x_t = {Sigma}_ap_tleft( a right)x_tleft( a right),$$
(2)
and the year-to-year change in length is given by
$${Delta}x_t = x_{left( {t + 1} right)} – x_t = {Sigma}_ap_tleft( a right)x_tleft( a right) + {Delta}p_tleft( a right)x_tleft( a right),$$
(3)
where
$$p_t(a) = 1/2left[ {p_{t + 1}(a) + p_t(a)} right],$$
(4)
and
$${Delta}p_t(a) = left[ {p_{t + 1}(a) – p_t(a)} right].$$
(5)
Solving these formulas year-to-year for each species in each population, we estimated the proportion of change in mean length due to changes in age structure and size-at-age. We included all populations for which we had five or more years of data (though change can only be estimated for consecutive years of data) and averaged the results across populations in each region.
Causes of age and size changes
To identify potential causes of change in salmon body size, we quantified associations with a variety of indices describing physical and biological conditions in Alaska’s freshwater and marine salmon habitats. Each candidate explanatory variable was selected based on existing biological hypotheses or inclusion in previous analyses of salmon size or population dynamics.
We considered several ocean climate indicators as potential causes of change in salmon size over time. Pacific Ocean conditions are often quantified using large-scale climate indices such as the Pacific Decadal Oscillation (PDO), El Niño Southern Oscillation (ENSO), and NPGO. These large-scale indices of ocean conditions, as proxies for climate and marine environment, have been shown to affect the survival and productivity of Pacific salmon in the North Pacific Ocean62,63. PDO, NPGO64, and MEI65,66 indices were all accessed and downloaded online (PDO, http://research.jisao.washington.edu/pdo/; NPGO, http://www.o3d.org/npgo/npgo.php, accessed 2018-02-07; MEI, https://www.esrl.noaa.gov/psd/enso/mei/, accessed 2018-02-08; MEIw, https://www.beringclimate.noaa.gov/, accessed 2018-02-08). In this analysis, winter means of NPGO and MEI were used in addition to an annual mean of MEI. Two ice cover metrics were also used to capture ocean climate conditions. Bering Sea ice cover and retreat were downloaded from https://www.beringclimate.noaa.gov/, originally derived from the National Snow and Ice Data Center data. Bering Sea ice cover index represents the winter anomaly, relative to 1981–2000 mean. Bering Sea ice retreat is an index representing number of days with ice cover after March 15.
Sea surface temperature (SST) was also explored as a potential cause of the changes in salmon size and age. SST has proven to be closely linked to salmon productivity. Mueter et al.67 found that regional-scale SST predicted survival rates better than large-scale climate indices such as the PDO. They concluded that survival rates were largely driven by environmental conditions at regional spatial scales. SST was extracted from the Extended Reconstructed Sea Surface Temperature (ERSST) version 468. To approximate SST values close to the river mouths which juvenile salmonids are most likely to experience after ocean entry, a double layer of the grid cells tracing the coastline of Alaska were extracted and the mean summer SST was calculated for each region.
Because in situ fluvial temperature measurements are sparse, both spatially and temporally, compared to the coverage of the AL dataset, air temperature was used as a proxy for temperature during the freshwater life stages. Air temperature data were extracted and sorted from remote-sensed satellite observations into multi-monthly regional means by season69.
Finally, we considered the potential for competition with other salmon to influence salmon size by including the abundances of several highly abundant salmon species as explanatory covariates. Using data compiled by Ruggerone and Irvine39, we evaluated the abundance of adult pink, chum, and sockeye salmon returning to Asia and North America as a proxy for the abundance of adult salmon of each species in the North Pacific. In addition, we also considered the more localized abundance of pink, chum, and sockeye salmon returning to Alaska, because salmon body size has been shown to vary with salmon abundance in the year of return migration in some species70 at finer spatial scales. The abundances of coho and Chinook salmon were not included, because they occur at much lower abundance than sockeye, chum, and pink salmon.
We also explored marine mammal abundances as potential predictor variables, but found that the data available precluded rigorous statistical comparison with our time series of salmon size and age structure. For example, the only estimates of orca abundance available for our study area (that from Southeast Alaska and Prince William Sound) show steady, near monotonic increases through our study period71,72. Statistically, this leads to insufficient replication and high collinearity with year effects. Although caution is warranted in interpretations of any models for which the assumptions are so obviously violated, we note that preliminary analyses including marine mammal abundance were not dramatically superior in terms of variance explained or model fit. Because of these limitations, we determined that a reliable test of the effect of marine mammal predation was not possible for Alaska.
Ultimately, we only selected covariates with an absolute correlation among covariate time series of less than 0.61. By establishing this threshold for absolute pairwise covariate correlation we sought to include only covariates for which separate associations with salmon size could be identified. The final set of covariates included in our analyses were: (1) ocean climate indicators (PDO, NPGO, MEI, winter MEI (MEIw), and Bering Sea ice cover index); (2) sea surface temperature (SST); (3) air temperature as proxy for freshwater temperature; and (4) ocean salmon abundance (abundance of Alaska sockeye, pink, and chum salmon, and North Pacific wide abundance of sockeye, pink, and chum salmon).
To test hypothesized associations between temporal trends in the average body size (length) of salmon and environmental conditions, we fit a series of Bayesian hierarchical models to data describing size trends across sampling locations for each species. Because the chain rule analysis showed that changes in age structure explained greater interannual body size variation than did changes in size-at-age, we analyzed age-aggregated mean body length. Time series, starting in 1975, of annual mean length by species for each sampling location (l) and environmental covariates were mean-variance (Z) standardized prior to model fitting. Models of the form
$$L_{i,t} = mathop {{Sigma}}limits_c ( {beta _{l,c} * X_{t – delta _{c,}c}} ) + sleft( t right) + varepsilon _{l,t},$$
(6)
were fit to each salmon species separately using Bayesian methods, where Ll,t is the standardized length at each location (l) in each return or observation year (t), βl,c are coefficients describing the effect of each covariate (c) on average length at each location, and (X_{t – delta _{c,}c}) is the standardized value of each covariate in each year. The reference year for each covariate is specified relative to the return year, or year in which salmon length compositions are observed (t), by a species and covariate-specific offset δc that associates covariate effects with the hypothesized period of interaction in each species’ life history (Supplementary Table S2). Location-specific covariate effects are structured hierarchically such that parameters describing the effect of each covariate on observed changes in average length were subject to a normally-distributed prior whose hyperparameters (group-level means and standard deviations for each covariate) were estimated directly from the data:
$$beta _{l,c} sim {mathrm{Normal}}left( {mu _c,tau _c ^{2}} right),$$
(7)
This hierarchical structure permitted us to quantify both the average (group-level) association between length observations at each sampling location (l) and hypothesized covariates (i.e., the hyperparameter μc), and the level of among-location variation in these effects (i.e., (τ_c^{2})). Prior distributions for model parameters were generally uninformative, with the exception of the prior on the group-level mean covariate effects (μc) which included a mild penalty toward zero,
$$mu _c sim {mathrm{Normal}}left( {0,1} right).$$
(8)
The prior distribution of the group-level (hyper) standard deviation of covariate effects was broad and truncated at zero,
$$tau _c sim {mathrm{Normal}}left( {0,10} right)left[ {0,} right],$$
(9)
allowing the model to freely estimate the appropriate level of among-location variability in covariate effects.
Observation error was assumed to be normally distributed εl,t ~ Normal(0, σε2), with a common observation error variance (σε2) estimated as a free parameter and subject to a broad prior distribution
$$sigma _varepsilon sim {mathrm{Normal}}left( {0,10} right)left[ {0,} right].$$
(10)
Each species-specific model also included a smoothed nonlinear year effect s(t) describing residual trends in length across time that were shared among sampling (observation) locations but were not explained by the covariates. The degree of nonlinearity for the univariate smooth s(t) quantifying the common residual trend in length is controlled by the variance term (σs) for the coefficients forming the spline73, for which a broad zero-truncated prior distribution was defined:
$$sigma _s sim {mathrm{Normal}}left( {0,10} right)left[ {0,} right].$$
(11)
Hierarchical Bayesian models describing the temporal trend in location-specific salmon length were fit using the brms package73,74 in R (R Core Team 2018), which generates posterior samples using the No U-Turn Sampler implemented in the Stan software platform75. Three independent chains were run for 20,000 iterations with a 50% burn-in and saving every tenth posterior sample, resulting in 3000 posterior samples. Convergence of all chains was diagnosed by ensuring potential scale reduction factors (R̂) for each parameter were More