Quantitative modeling of radioactive cesium concentrations in large omnivorous mammals after the Fukushima nuclear power plant accident

Data sets

Radioactivity measurement data for several species of wild game mammals and birds in Fukushima Prefecture from May 2011 to March 2018 were released to the public by the Fukushima Prefecture Government (https://emdb.jaea.go.jp/emdb/en/portals/1040501000/). We extracted the data for wild boar (Sus scrofa), 1404 samples, and Asian black bear (Ursus thibetanus), 422 samples. The resulting boar and bear data sets contained total radioactive cesium activity (¹³⁴Cs + ¹³⁷Cs isotopes) values (in Bq/kg) from animals captured at different times and locations within Fukushima Prefecture. The data were imported for analysis into R 4.0.3 software²¹.

We ln-transformed the cesium activity values to bring their distribution closer to normal, creating the variable LnCsTot. To facilitate regression analyses (described below), we removed instances of missing data and cesium levels below detection: 20 samples (1.4%) for boar and 15 samples (3.3%) for bears. The time when each sample was taken (labeled “Day of collection” in the Fukushima Prefecture Government data set) was converted to years since the Fukushima accident (since March 11, 2011), assuming that 1 year = 365.25 days. This time of sample collection in years was called variable T.

Since for each sample some time passed between sample collection and radioactivity measurement (labeled “Result found Date”, called T_r in our notation), we needed to correct the reported LnCsTot values for physical decay over this time, which was different for different samples. The procedure used to perform this correction is described in Supplementary methods. The data with corrected total cesium values (LnCs_c) are provided in Supplementary data (Supplementary_Dataset_File_Full).

Mathematical model

To describe the data on ln-transformed total radioactive cesium levels (LnSc_c) in each species as function of time after the accident (T), we developed the following simple mathematical model (Eqs. 1A, 1B):

$${LnCs}_{c}=X+Q-mu times {T}^{nu }+Atimes mathrm{sin}left[2times pi times left(T+Pright)right], $$

(1A)

$$X=mathrm{ln}left[mathrm{exp}left(LnCs{134}_{t{0}_{r}}right)times {2}^{-frac{T}{{Th}_{Cs134}}}+mathrm{exp}left(LnCs{137}_{t{0}_{r}}right)times {2}^{-frac{T}{{Th}_{Cs137}}}right]$$

(1B)

Here the term X represents the estimated average radioactive cesium level in the studied area, based on the intercepts (LnCs134_t0r for ¹³⁴Cs and LnCs137_t0r for ¹³⁷Cs, respectively) from robust regression discussed in Supplementary methods, and taking into account only physical decay for each isotope (with half-lives of Th_Cs134 for ¹³⁴Cs and Th_Cs134 for ¹³⁷Cs, respectively). The terms Q, µ, ν, A and P represent adjustable model parameters. Parameter Q represents the fitted relationship between radioactive cesium levels in the animal (Bq/kg), relative to the external environment (Bq/m²). Parameter µ represents the net rate of radioactive cesium reduction in animal tissues over time due to all processes except physical decay (e.g. decrease in bioavailability due to migration of cesium into deeper soil layers, human-mediated cleanup efforts, etc.). Parameter ν is a potential power dependence for these processes. By default, ν was set to ν = 1, but exploratory calculations using ν = 2 or treating ν as a freely adjustable parameter (≥ 0.1) were performed as well. Parameters A and P in the sine function represent a sinusoidal approximation for seasonal changes in radioactive cesium levels in animal tissues (e.g. due to seasonal variations in diet and life style), where A is the amplitude of the oscillations, P is the phase shift, and the period is set to 1 year. For simplicity, these parameters were assumed to be the same for both studied cesium isotopes. The descriptions of each parameter are also presented in Table 1.

Table 1 The meanings of all parameters used in our mathematical model (Eq. 1A, 1B) for radioactive cesium levels in wild boar (Sus scrofa) and Asian black bear (Ursus thibetanus).

Full size table

Model fitting approaches

Initially, we used nonlinear ordinary least squares (OLS) regression (nls R function) to fit the model (Eq. 1A, 1B) to the data. To find the global optimum fit, we repeated the fitting procedure 2000 times with slightly different random initial parameter values and recorded the solution with the smallest root mean squared error (RMSE). Diagnostics on this regression included checking of convergence criteria and analyses of residuals (by scatter plot and histogram, regressing residuals as function of T, visualizing the QQ plot, autocorrelation and partial autocorrelation functions with 95% confidence intervals, performing the Shapiro–Wilk normality test, and calculating skewness and kurtosis). For boar data, diagnostics revealed problems with convergence (both X-convergence and relative convergence) and non-normality of residuals: e.g. Shapiro–Wilk p-value = 1.476 × 10^–7, skewness = − 0.37, kurtosis = 3.50. For black bear data similar problems occurred with convergence, but residuals were closer to the normal distribution (perhaps due to smaller sample size): e.g. Shapiro–Wilk p-value = 0.0526, skewness = − 0.058, kurtosis = 2.45.

Due to these issues, we used robust nonlinear regression (nlrob R package) to reduce the effects of “outlier” data points. To find the global optimum, we repeated the fitting procedure 2000 times with slightly different random initial parameter values and selected the solution with the smallest absolute value of median residuals. The best-fit parameters for OLS and robust regressions were somewhat different for both boar and bear data. For boar data, the minimum robustness weight was 0.339 and the median was 0.762, and the corresponding values for black bear data were 0.557 and 0.821, respectively.

For each species, we compared the performances of model variants with different assumptions about parameter ν: (1) The default case with ν = 1, which represents an exponential rate of radioactive cesium decrease due to processes other than physical decay. (2) The case with ν = 2, which represents quadratic decay. (3) The case with ν being freely adjustable (≥ 0.1). The comparisons were based on Akaike information criterion (AIC)^22,23. The purpose of these calculations was to better assess the shape of the time course for non-physical factors involved in radioactive cesium level decline in animal tissues over time after the accident.

In addition to analyzing the full data set for each species, we also performed separate analyses on subsets of data from specific locations: from those districts of Fukushima Prefecture where the mean radioactive cesium levels in animal samples were the highest, and where a sufficiently large number of samples was present. For wild boar, the two selected districts for this subset analysis were Soso and Kenpoku (819 samples), and for black bear they were Kenpoku and Kenchu (163 samples).

To further assess the sensitivity of model results to geographical and temporal factors, we also constructed a separate subset of data for each species. This subset excluded data from the Aizu and Minamiaizu districts, which are separated by mountains from the Fukushima Daiichi Nuclear Power Plant, and excluded data collected ≤ 6 months after the accident. These restrictions were intended to determine model performance on data collected in a more geographically contiguous area after the initial abrupt changes in contamination levels were completed and the system entered the phase of more stable kinetics. The purpose of all these analyses was to assess whether the time course of radioactive cesium levels in the bodies of each species differed between locations with high contamination vs. those with lower contamination, and as function of time after the accident.

We were interested in quantifying not only the center of the distribution of radioactive cesium values in each species over time, but also in assessing the lower and upper tails of this distribution. For this purpose, we fitted the model (Eq. 1A, 1B) for each species using quantile regression (nlrq function in quantreg R package) for the median (50th percentile), and also for the 25th and 75th percentiles. Initial parameter estimates for the quantile regressions were taken from best-fit parameters from robust regression described above. The 25th and 75th percentiles were selected instead of more extreme values (e.g. 5th and 95th) because the latter resulted in poor quality fits due to limited amounts of data at the fringes of the distribution.

To assess the variability of model parameters by location in more detail, we used mixed effects modeling (nlme R package) on the data from each species. Since original OLS fits suggested substantial deviations of residuals from the normality assumption, we performed mixed effects modeling on data with some outlier data points removed. The OutlierDetection package in R removed 43 boar samples and 5 bear samples. These outliers are listed in the Supplementary_outlier_data_points file. The remaining samples were used for mixed effects model fitting, but model performance metrics like coefficient of determination (R²) and RMSE were assessed on the full data set (with outliers included) for each species.

Since the Fligner-Killeen test of homogeneity of variances by district generated low p-values for both species (4.6 × 10^–14 for boar and 0.018 for black bear), we allowed modelled variances to differ by district (using the weights option in nlme). We investigated several random effects structures for some or all model parameters, with randomness by district only, or by district and municipality within district. Model diagnostics were the same as for fixed effects OLS modeling described above, and also included boxplots of model residuals by district. The mixed effects model variants with different random effects structures were compared using the anova function in R, and also by assessing convergence criteria, normality of residuals, skewness, and kurtosis. Consequently, preferred mixed effects model variants were selected for the full data as well as for the subset of two districts with high radioactive cesium levels, separately for each species.

Model extrapolation from training to testing data

To investigate how the robust and quantile regression fits of our model could extrapolate beyond the time range that was used for model fitting, we split the data for each species into “training” (early times) and “testing” (later times) parts. The split was done based on time since the accident (T variable), so that approximately ½ of the samples were assigned to the training and testing sets, respectively. For wild boar data, the training set included times between 0.20 and 3.45 years after the accident, and the testing set included times between 3.45 and 7.03 years. For black bear data, the training set included times between 0.42 and 3.46 years after the accident, and the testing set included times between 3.46 and 6.87 years.

We also evaluated an alternative approach to splitting the data, where the split was done randomly instead of by time. In other words, any data point regardless of time had an equal probability of being assigned to either the training or the testing data set. Both the training and testing data subsets generated by this random split included the complete time range. This approach was implemented in context of the sensitivity analysis described above.

For each species, robust and quantile regressions were fitted to training data, and their predictions were calculated for testing data. For robust regression, RMSE was calculated on testing data for two scenarios: (1) for the model fitted to training data only, and (2) for the model fitted over the entire data range (training + testing). These RMSE values for conditions 1 and 2 were compared to assess the quality of model extrapolation. Extrapolation performance for robust and quantile regressions was also assessed visually by plotting the model predictions and data.

Application of the model to wild boar data from the Chernobyl accident area

To compare the results of our analysis of wild boar contamination with radioactive cesium in the area affected by the Fukushima accident with data from another location, we also analyzed wild boar data from the Chernobyl accident area. These data were published by Gulakov¹⁴ and contain summaries of ¹³⁷Cs contamination levels in the muscles of 188 boar collected between 1991 and 2008 (i.e. from 5 to 22 years after the 1986 accident). Sampling was carried out in three zones with different land contamination levels with ¹³⁷Cs. This data set provides important information on radioactive cesium contamination in wild boar in the Chernobyl area. Unfortunately, ¹³⁷Cs measurements in each sampled boar were not provided by Gulakov¹⁴, and only summary statistics are available for each zone and year after the accident (Tables 1–3 in reference¹⁴): number of animals, mean, minimum and maximum ¹³⁷Cs levels.

We could not apply the full model (Eq. 1A, 1B) to these summary data which lacked seasonality information and ¹³⁴Cs data. However, we were able to perform a weighted linear regression to quantify the ecological half-life of ¹³⁷Cs in Chernobyl boar and the relationship between ¹³⁷Cs levels in the animals (Bq/kg), relative to the external environment (Bq/m²). The data used for this analysis, derived from Gulakov¹⁴, are provided in Supplementary data (Supplementary_Dataset_File_Full). They contain the following variables. Zone = location of sample collection (Alienation, Permanent control or Periodic control). Time = time in years after the Chernobyl accident. LnMeanCs = ln-transformed mean ¹³⁷Cs level in boar muscle (Bq/kg). LnMeanCs_c = LnMeanCs − X, where X is ln-transformed ¹³⁷Cs land contamination (Bq/m²) in the given zone, corrected for physical decay of ¹³⁷Cs. Weight = weighting of each data point used for regression. Weight = number of animals/(ln[maximum ¹³⁷Cs level] − ln[minimum ¹³⁷Cs level])². These approximately inverse-variance weights were normalized by the overall mean across all data points, so that the mean weight across all data points was set to 1.

These data were analyzed by weighted linear regression in R, where LnMeanCs_c was allowed to depend on Time and Zone variables. Model variants containing all possible combinations and pairwise interactions between these predictor variables were fitted and their performances were compared using the Akaike information criterion with correction for small sample size (AICc). These calculations were performed using the glmulti R package. Multimodel inference (MMI) was performed on this collection of fitted model variants. It resulted in the calculation of model-averaged parameter estimates, 95% CIs and importance scores, corrected for model selection uncertainty. Of main interest here were the intercept parameter, which is analogous to parameter Q in the full model (Eq. 1A, 1B), and the Time parameter, which is analogous to parameter µ in the full model. The ecological half-life for ¹³⁷Cs was calculated based on the Time parameter.

Source: Ecology - nature.com

Quantitative modeling of radioactive cesium concentrations in large omnivorous mammals after the Fukushima nuclear power plant accident

Data sets

Mathematical model

Model fitting approaches

Model extrapolation from training to testing data

Application of the model to wild boar data from the Chernobyl accident area

Viruses infecting a warm water picoeukaryote shed light on spatial co-occurrence dynamics of marine viruses and their hosts

Substrate control of sulphur utilisation and microbial stoichiometry in soil: Results of 13C, 15N, 14C, and 35S quad labelling

ITALIAN LANGUAGE

ENGLISH LANGUAGE