in

Honey bee colony loss linked to parasites, pesticides and extreme weather across the United States

Honey bee colony loss and parasites across space and time

Honey bee colony loss strongly depends on spatio-temporal factors33,42, which in turn have to be jointly modeled with other stressors. Focusing on CONUS climatic regions, defined by the National Centers for Environmental Information40 (see Fig. 1), this is supported by the box plots in Fig. 2 which depict appropriately normalized honey bee colony loss (upper panel) and presence of V. destructor (lower panel) quarterly between 2015 and 2021. Specifically, Fig. 2a highlights that the first quarter generally accounts for a higher and more variable proportion of losses. Average losses are typically lower and less dispersed during the second quarter, and then tend to increase again during the third and fourth quarters. The Central region, which reports the highest median losses during the first quarter (larger than 20%) exemplifies this pattern, which is in line with existing studies that link overwintering with honey bee colony loss6,29,30,31,32,33,43. On the other hand, the West North Central region follows a different pattern, where losses are typically lower during the first quarter and peak during the third. This holds, albeit less markedly, also for Northwest and Southwest regions. These differing patterns are also depicted in Fig. 3, which shows the time series of normalized colony loss for each state belonging to Central and West North Central regions – with the smoothed conditional means highlighted in black and red, respectively. Figure 2b shows that also the presence of V. destructor tends to follow a specific pattern; in most regions it increases from the first to the third quarter, and then it decreases in the fourth – with the exception of the Southwest region, where it keeps increasing. This is most likely because most beekeepers try to get V. destructor levels low by fall, so that colonies are as healthy as possible going into winter, and also because of the population dynamics of V. destructor alongside honey bee colonies – i.e., their presence typically increases as the colony grows and has more brood cycles, since this parasite develops inside honey bee brood cells44,45. The West region (which encompasses only California since Nevada was missing in the honey bee dataset; see Data) reports high levels of V. destructor throughout the year, with very small variability. A comparison of Fig. 2a and b shows that honey bee colony loss and the presence of V. destructor tend to be higher than the corresponding medians during the third quarter, suggesting a positive association. This is further confirmed in Fig. 4, which shows a scatter plot of normalized colony loss against V. destructor presence, documenting a positive association in all quarters. Although with the data at hand we are not able to capture honey bee movement across states, as well as intra-quarter losses and honey production, these preliminary findings can be useful to support commercial beekeeper strategies and require further investigation.

Figure 2

Empirical distribution of honey bee (Apis mellifera) colony loss (a) and Varroa destructor presence (b) across quarters (the first one being January-March) and climatic regions; red dashed lines indicate the overall medians. (a) Box plots of normalized colony loss (number of lost colonies over the maximum number of colonies) for each quarter of 2015–2021 and each climatic region. At the contiguous United States level, this follows a stable pattern across the years, with higher and more variable losses during the first quarter (see Supplementary Figs. S2-S6), but some regions do depart from this pattern (e.g., West North Central). (b) Box plots of normalized V. destructor presence (number of colonies affected by V. destructor over the maximum number of colonies) for each quarter of 2015–2021 and each climatic region. The maximum number of colonies is defined as the number of colonies at the beginning of a quarter, plus all colonies moved into that region during the same quarter.

Full size image
Figure 3

Comparison of normalized honey bee (Apis mellifera) colony loss (number of lost colonies over the maximum number of colonies) between Central and West North Central climatic regions for each quarter of 2015–2021 (the first quarter being January-March). (a) Trajectory of each state belonging to Central (yellow) and West North Central (blue) climatic regions. (b) Smoothed conditional means for each of the two sets of curves based on a locally weighted running line smoother where the width of the sliding window is equal to 0.2 and corresponding standard error bands are based on a 0.95 confidence level46.

Full size image
Figure 4

Scatter plot of normalized honey bee (Apis mellifera) colony loss (number of lost colonies over the maximum number of colonies) against normalized Varroa destructor presence (number of colonies affected by V. destructor over the maximum number of colonies) for each state and each quarter of 2015–2021 (the first quarter being January-March). Points are color-coded by quarter, and ordinary least squares fits (with corresponding standard error bands based on a 0.95 confidence level) computed by quarter are superimposed to visualize the positive association.

Full size image

Up-scaling weather data

The data sets available to us for weather related variables had a much finer spatio-temporal resolution (daily and on a (4 times 4) kilometer grid) than the colony loss data (quarterly and at the state level). Therefore, we aggregated the former to match the latter. For similar data up-scaling tasks, sums or means are commonly employed to summarize the variables available at finer resolution47. The problem with aggregating data in such a manner is that one only preserves information on the “center” of the distributions – thus losing a potentially considerable amount of information. To retain richer weather related information in our study, we considered additional summaries capturing more complex characteristics, e.g., the tails of the distributions or their entropy, to ascertain whether they may help in predicting honey bee colony loss. Within each state and quarter we therefore computed, in addition to means, indexes such as standard deviation, skewness, kurtosis, (L_2)-norm (or energy), entropy and tail indexes48. This was done for minimum and maximum temperatures, as well as precipitation data (see Data processing for details).

Next, as a first way to validate the proposed weather data up-scaling approach, we performed a likelihood ratio test between nested models. Specifically, we considered a linear regression for colony loss (see Statistical model) and compared an ordinary least squares fit comprising all the computed indexes as predictors (the full model) against one comprising only means and standard deviations (the reduced model). The test showed that the use of additional indexes provides a statistically significant improvement in the fit (p(text {value}=0.03)). This test, which can be replicated for other choices of models and estimation methods (see Supplementary Table S5), supports the use of our up-scaling approach.

Figure 5 provides a spatial representation of (normalized) honey bee colony losses and of three indexes relative to the minimum temperature distribution; namely, mean, kurtosis and skewness (these all turn out to be relevant predictors based on subsequent analyses; see Table 1). For each of the four quantities, the maps are color-coded by state based on the median of first quarter values over the period 2015-2021 (first quarters typically have the highest losses, but similar patterns can be observed for other quarters; see Supplementary Figs. S12-S14). Notably, the indexes capture characteristics of the within-state distributions of minimum temperatures that do vary geographically. For example, considering minimum temperature, skewness is an index that (broadly speaking) provides information on whether the data tends to accumulate at one end or the other of the observed range of minimum temperatures (i.e., a positive/negative skewness indicates that the data accumulates towards the lower/upper range, respectively). On the other hand, kurtosis is an index that captures the presence of “extreme” values in the tails of the data (i.e., a low/high value of kurtosis indicates that the tail minimum temperatures are relatively close/very far from the typical minimum temperatures). With this in mind, going back to Fig. 5, we can see that minimum temperatures in states in the north-west present large kurtosis (a prevalence of extreme values in the tails) and negative skewness (a tendency to accumulate towards the upper values of the minimum temperature range), while the opposite is true for states in the south-east. More generally, the mean minimum temperature separates northern vs southern states, kurtosis is higher for states located in the central band of the CONUS, and skewness separates western vs eastern states.

We further note that the states with lower losses during the first quarter (e.g., Montana and Wyoming) do not report extreme values in any of the considered indexes. Although these states are generally characterized by low minimum temperatures, these are somewhat “stable” (they do not show marked kurtosis or skewness in their distributions) – perhaps allowing honey bees and beekeepers to adapt to more predictable conditions. On the other hand, states with higher losses during the first quarter such as New Mexico have higher minimum temperatures as well as marked kurtosis, and thus higher chances of extreme minimum temperatures – which may indeed affect honey bee behavior and colony loss. Overall, across all quarters of the years 2015-2021, we found that normalized colony losses and mean minimum temperatures are negatively associated (the Pearson correlation is -0.17 with a p(text {value}<10^{-6}) and a sample size of 937). Among all quarters, we also found that the second and the third over the same period showed significantly different kurtosis and skewness of minimum temperatures between states with high and low normalized colony losses (t-tests for the difference in mean between minimum temperature kurtosis and skewness for states above and below the overall normalized colony loss median provide p-values smaller than (10^{-4}) and (10^{-3}), respectively, for a sample size of of 472). Similarly, meaningful associations can be outlined also for other indexes we constructed (see Supplementary Figs. S9-S11), lending additional support to our up-scaling approach. However, these are all “marginal” findings, concerning one potential predictor at a time. Our next task is to move to an analysis that accounts for multiple relationships.

Joint modeling highlights the roles of Varroa destructor, pesticides and extreme weather events

To construct an effective and interpretable model comprising multiple predictors, we need to select which, among the variables at our disposal (including the additional indexes we built by up-scaling weather data), are jointly most predictive of honey bee colony loss. This feature selection exercise is rendered more complex by at least two factors. First, the candidate features we consider, especially the indexes produced by up-scaling, present strong collinearities (see Supplementary Fig. S8). Second, because of the coarse spatio-temporal resolution at which they are measured, most of our variables are likely to aggregate several underlying stochastic mechanisms and thus to contain spurious “contaminations” and outliers that can induce biases and hinder both feature selection and estimation of effects. Hence, while assuming that the majority of the observations are generated by one mechanism (the one being modeled), we need a procedure that can exclude a portion of them from the model fit. For this joint analysis, unlike the descriptive statistics described above, we only consider data covering the years 2015-2019, as honey bee data for 2020-2021 may be affected by the Covid-19 outbreak, and they may also require further validation from the United States Department of Agriculture-National Agricultural Statistics Service (USDA-NASS). Results from an extended analysis covering 2015-2021 are reported in Supplementary Table S13 for comparison, and they are consistent with the ones based on 2015-2019 data.

Figure 5

Spatial representation (by state) of median values for four different indices regarding colony loss and minimum temperature in the first quarter (January-March) of seven consecutive years (2015–2021) for each state. (a) Normalized honey bee (Apis mellifera) colony loss. (b) Mean of minimum temperatures. (c) Kurtosis of minimum temperatures (how “extreme” the minimum temperatures were). (d) Skewness of minimum temperatures (whether they tended to concentrate in their lower or upper range). In each panel, the color attributed to a state represents the median of seven index values (first quarter of seven years). North Dakota shows a relatively low normalized colony loss (panel (a)), one of the lowest mean minimum temperature (panel (b)), and one of the lowest minimum temperature kurtosis (panel (c)). This suggests that consistently low minimum temperatures during the first quarter (low mean and low kurtosis) may be associated with lower colony loss in that state. The map has been generated by the authors in R 3.6.249.

Full size image
Table 1 Features selected using the mixed-integer programming procedure described in Insolia et al. (2021)50 for the years 2015-2019, with corresponding coefficient estimates, standard errors, t-statistics and p-values computed on a subset encompassing 90% of the observations (these are 607 selected as “non-outlying”, concurrently with the feature selection). Group-constraints are used to ensure that the terms introduced to represent each categorical control, e.g., the three terms representing quarters (the first one being January-March), are either all selected or all excluded (the categorical controls used are year, quarter and climatic region). The model has an (R^2=0.601).
Full size table

After transforming normalized colony loss values (per quarter, per state) into log-odds ratios, we regressed them on the features at our disposal. These are 24 features in total, encompassing several stressors (V. destructor, pests and parasites, diseases, pesticides, etc.), weather-related information (various indexes computed on minimum and maximum temperatures and precipitation), land use, as well as categorical controls for climatic regions, years and quarters; see Data for details. On this set of variables, we applied state-of-the-art statistical learning tools for simultaneous feature selection and outlier detection (see Statistical model for details). Specifically, we employed a combinatorial procedure developed by our group50, which selects subsets of relevant features and non-outlying points and, on such subsets, is equivalent to an ordinary least squares estimator. Table 1 shows results produced by the procedure when the proportion of outliers to be excluded from the fit is set to 10%. Details on parameter tuning, out-of-sample prediction performance and model diagnostics are provided in Supplementary Figs. S16-S19; similar results obtained with alternative models and estimation methods are discussed in Supplementary Table S8.

Only 15 out of 25 features (including the intercept) were selected as relevant and provided an (R^2=0.6). In considering estimated coefficients and their signs, recall that negative/positive signs correspond to positive/negative impacts on honey bees (that is, lower/higher colony loss) and that estimates in a joint model need to be interpreted in context (that is, conditional on the other features in the model “being held fixed”). We must note that the following findings take several potential stressors and spatio-temporal controls into account and are thus more complete, informative and interpretable than marginal analyses. Specifically, the descriptive results reported in Figures 2–5 consider indices in isolation from others and are consequently not fully comparable to the modeling results which take into account several stressors and control variables. All categorical controls were among the relevant features. Signs here are relative to the reference categories, which are “Central” for regions, “2019” for years and “4th” for quarters (these references do not appear in the Table). In terms of spatial effects, Southeast experiences the highest losses; Southwest and Central (the reference category) have similarly high losses, while all other regions show significantly lower losses – with particularly large decreases for Northwest and West North Central (these findings are in line with a separate analysis conducted with state-level controls; see Supplementary Table S9). In terms of temporal effects, years do not appear to have a large impact overall – aside from 2015, when losses were significantly higher. Quarters have a larger impact, with first and second quarters characterized by significantly elevated and reduced losses, respectively – likely due to the fact that most vulnerable colonies die in the Winter, leaving a much healthier population at the start of the Spring (see also the box plots in Fig. 2a; the third quarter estimated coefficient is positive, but the effect is not significant).

Even in conjunction with the spatio-temporal controls, and consistently with existing literature, we found that the presence of V. destructor9,10,11,15, the use of pesticides51,52,53 and “other” factors appear to be positively associated with honey bee colony loss16,28. Although our findings are not sufficient to draw definite conclusions on causal mechanisms, the statistical associations we documented provide insights and offer hypotheses that can guide future research. Based on the USDA-NASS definition, “other” is a very broad feature which includes factors such as weather, starvation, insufficient forage, queen failure, hive being damaged or destroyed, etc. (see Data); we employ it as an additional control variable, which can usefully mitigate confounding effects, but we do not attempt to interpret it – as we do not have a way to assess the role of individual factors within it based on the data at our disposal. Importantly, “other” does provide a significant signal in modeling honey bee colony loss (p(text {value}<10^{-4})). Similarly, we notice that the variable “other pests and parasites” (tracheal mites, nosema, hive beetle, wax moths, etc.; see again Data) appears to be significant but, as for the variable “other”, we do not attempt to interpret it in depth because of its broad definition – we only use it as a control variable. However, if one were to attempt an interpretation of the negative sign of this variable, it may be, at least in part, due to the presence of collinearities within the selected features; see Supplementary Fig. S19. The marginal correlation between “other pests and parasites” and losses is positive, but this feature is also correlated with V. destructor, “other” and “pesticides”, which increase losses54. To some extent, these features all capture related effects and disentangling the roles of correlated predictors is non-trivial (see Supplementary Table S7). The negative estimated coefficient when “other pests and parasites” is evaluated jointly with V. destructor and “pesticides” could be interpreted as a conditional proxy for the beneficial effect of beekeepers’ expertise, since the pathogens in this category are likely harder to detect and treat compared to, e.g., V. destructor. Unfortunately, this hypothesis cannot be validated empirically in the current study, due to the lack of information regarding beekeepers’ expertise in USDA-NASS data. Thus, although the estimated sign for “other pests and parasites” is different than what one would expect based on a marginal analysis, different estimation techniques and modeling strategies support this result (see Supplementary Tables S8-S12). The issue certainly warrants further investigation in the future.

Consistent with our initial likelihood ratio test, also six among the indexes produced by up-scaling weather data were selected as relevant by the procedure. These concern the distributions of minimum temperatures, maximum temperatures and precipitations. Standard deviation, skweness, and kurtosis of minimum temperatures appear to significantly increase losses – suggesting an aggravating role for variability in general, and more specifically for extreme minimum temperature events (concentration on extreme values and tail heaviness of the distribution). This is confirmed by the significant negative effect of the minimum temperatures alpha index – another indicator of the frequency of extreme events; an increase in the index signifies a decrease in extreme events48. Extreme maximum temperatures, as captured by their kurtosis, also appear to significantly increase losses, as does the entropy of precipitations. The latter could be interpreted as an effect of the inconsistency of precipitation patterns within a given state and quarter, which may affect the effectiveness of foraging behaviors (bees do not fly during heavy precipitation) and thus increase the probability of colony loss. This supports existing studies connecting colony loss with changing weather patterns24,26,27,37,43,55,56.

Finally, the “green-area index”, which captures urbanization39 (it is lower/higher for more/less urban areas; see Data processing for more detail on the definition), was selected as relevant by our procedure, with a significant positive effect. This suggests that, conditional on all other features included in the model, losses increase when green areas are more abundant. This is in contrast with the result in Naug (2009)39 which, however, was based on a regression of losses on land use alone. Indeed, we found that the sign of this relationship does depend on the joint model and data considered, e.g., it can change as one changes the controls included in the model and the set of observations detected as outliers and removed from the fit (see Supplementary Table S8). For instance, using state-level controls in place of regional controls, while not affecting sign and significance of other effects, results in a non-significant negative effect of “green-area index” (p(text {value}=0.9); see Supplementary Table S9). The way this index was constructed supports the intuition that it captures state-level variability (see Data processing), and its marginal correlation with other selected features is very weak (see Supplementary Figs. S15, S19); further investigation of its association with colony loss conditional on control factors and other features is clearly warranted, also in local-scale studies. We also remark that green areas, particularly crops, may offer transient forage to honey bees, having a detrimental effect on the diversity and availability of forage. Moreover, due to pesticide use, green spaces corresponding to crops may have an additional negative impact on honey bees’ health. To investigate these effects, we decomposed the “green-area index” treating crops and other green areas separately, and found that our results do not change introducing such decomposition. This is likely due to the fact that the two component parts and the overall index are correlated and thus capture similar effects – e.g., the “green-area index” and the index based only on crops have an overall Pearson correlation of 0.84 (this relationship becomes even stronger if we control for states or climatic regions).

In terms of outlier detection, our procedure identified some locations and periods which experienced unexpectedly “high” or “low” honey bee colony losses compared to the overall trend. These terms indicate observations for which the estimated regression residuals are much larger in magnitude than the remaining cases, and are characterized by a positive or negative sign, respectively. Specifically, we found that unexpectedly high losses tend to cluster in the third quarter in West North Central (Nebraska, South Dakota) or Southern regions (Arkansas, Kansas) – i.e., areas where expected losses are low. In contrast, unexpectedly low losses tend to cluster in the third quarter in Northeastern (New Jersey, Vermont) and Southern (Louisiana, Oklahoma) regions. Both types of unexpected events are less frequent in the period with highest expected losses, which is likely due to overwintering impacts, and the year 2015 accounted for a significant number of those (especially with lower losses); see Supplementary Table S6 for details. The distribution of the points detected as outliers deviates from the remaining observations as well. For instance, unexpectedly high losses are associated with lower levels of V. destructor, but the opposite holds for unexpectedly low losses. Outlying cases also showed markedly lower levels of the variables “other” and “other pests and parasites” (supporting the fact that they capture additional features of the error distribution compared to the presence of V. destructor) and larger values of the “green-area index”; see Supplementary Fig. S20.


Source: Ecology - nature.com

Features of urban green spaces associated with positive emotions, mindfulness and relaxation

Using game engines and “twins” to co-create stories of climate futures