Global statistical analysis
Our first attempt to identify plausible effects of meteorological covariates on COVID-19 spread applied a comparative regression analysis. To this end, we focused on the exponential onset of the disease, as it is the epidemic phase that allows for a better comparison between countries or regions, without the confounding effect of intervention policies. We first determined, for each of the spatial units (either countries or NUTS (nomenclature of territorial units for statistics) 2 regions), the day on which 20 or more cumulative cases were officially reported. We then fitted the first-order polynomial function f(t) = x0 + rt to the next 20 days of log-transformed data, where t represents time (in days) and x0 is the value at the initial condition t = 0. The parameter r can be understood as the exponential growth rate, and was then used to estimate the basic reproduction number (R0) using the estimated serial interval T for COVID-19 of 4.7 days53, such that R0 = 1 + rT (ref. 54). (We note that we are interested here in the relationship between the reproduction number and the meteorological covariates, and not in the actual inference of R0.) Once R0 was obtained for all our spatial units, we filtered our meteorological data to match the same fitting period (with a 10-day negative delay to account for an incubation and reporting lapse) for every spatial unit. To compute a single average of the meteorological variables per regional unit, we computed a weighted average on the basis of the population contribution of each grid cell to the total population of the region. We did so to obtain an aggregated value that better represents the impact of these factors on the population-level transmission of COVID-19, as the same variation in weather in a high-density urban area is more likely to contribute to a change in population-level transmission than that in an unpopulated rural area. We then averaged the daily values of temperature and AH for each country and computed univariate linear models for each of these variables as predictors of R0. Given the somewhat arbitrary criteria used to select the dates for estimating R0 in each country, a sensitivity analysis was run to test the robustness of the regressions to changes in the related parameters. We tested 70 different combinations of two parameters: the total number of days used for the fit (18–27) and the threshold of cumulative COVID-19 cases used to select the initial day of the fit (15–45). We also calculated the weather averages by shifting the selected dates accordingly. A linear model was then fitted for each of the estimates, for both T and AH. A summary of the distribution of parameter estimates (the regression slope coefficients and the R2 of the models) is shown in Extended Data Fig. 3.
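For concreteness, the fitting step can be sketched as follows. This is a minimal illustration assuming a 1-D array of daily cumulative cases for one spatial unit; the function name `estimate_r0` and its defaults simply mirror the thresholds quoted above and are not the published implementation.

```python
import numpy as np

def estimate_r0(cum_cases, threshold=20, n_days=20, serial_interval=4.7):
    """Estimate R0 from the exponential onset of a cumulative case series.

    cum_cases : 1-D array of daily cumulative confirmed cases for one region.
    Returns (r, R0), or (nan, nan) if the series never reaches the threshold
    or is too short for the fitting window.
    """
    cum_cases = np.asarray(cum_cases, dtype=float)
    above = np.flatnonzero(cum_cases >= threshold)
    if above.size == 0 or above[0] + n_days > cum_cases.size:
        return np.nan, np.nan
    start = above[0]
    window = cum_cases[start:start + n_days]
    t = np.arange(n_days)
    # First-order polynomial fit to log-transformed cases: log(y) = x0 + r*t
    r, x0 = np.polyfit(t, np.log(window), 1)
    return r, 1.0 + r * serial_interval  # R0 = 1 + r*T
```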
Bivariate time-series analysis with scale-dependent correlations
To examine associations between cases and climate factors in more detail, SDC analysis was performed on the daily time series of both COVID-19 incidence and a given meteorological variable. SDC is an optimal method for identifying dynamical couplings in short and noisy time series20,21. In general, Spearman correlations between incidence and a meteorological time series assess whether there is a monotonic relation between the variables. SDC analysis was specifically developed to study transitory associations that are local in time at a specified temporal scale, corresponding to the size of the time intervals considered (s). The two-way implementation (TW-SDC) is a bivariate method that computes non-parametric Spearman rank correlations between two time series, for different pairs of time intervals along these series. Different window sizes (s) can be used to examine increasingly finer temporal resolution. The results are sensitive to the value of this window size, s, with the highest significant correlation values expected at the scale of the transient coupling between the variables. Correlation values decrease in magnitude as the window size increases and averages are computed over too long a time interval. Values can also decrease and become non-significant for small windows when correlations are spurious. Here, the method was applied for windows of different lengths (from s = 75 down to s = 14 days); although a weekly cycle showed up in some cases for small s, the results were robust to removal of this cycle, and we therefore did not remove it.
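The core of the TW-SDC grid can be sketched as follows. This is a simplified illustration assuming two equal-length numpy arrays; it uses the parametric p-value returned by scipy's spearmanr as a stand-in for the randomization-based significance test described further below, and the function name `tw_sdc` is only illustrative.

```python
import numpy as np
from scipy.stats import spearmanr

def tw_sdc(x, y, s=21, alpha=0.05):
    """Two-way scale-dependent correlation (simplified sketch).

    Computes Spearman correlations between every pair of length-s windows
    of series x and y; returns the correlation grid, the p-value grid and
    the grid with only 'significant' cells retained.
    """
    n = len(x) - s + 1
    rho = np.full((n, n), np.nan)
    pval = np.full((n, n), np.nan)
    for i in range(n):          # window position along x (rows)
        for j in range(n):      # window position along y (columns)
            r, p = spearmanr(x[i:i + s], y[j:j + s])
            rho[i, j], pval[i, j] = r, p
    rho_sig = np.where(pval < alpha, rho, np.nan)  # keep only significant cells
    return rho, pval, rho_sig
```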
The results are typically displayed in a figure with the following subplots: (1) the two time series, to the left and top of the matrix of correlation values, respectively; (2) the matrix or grid of correlation values itself in the center, with significant correlations colored in blue when positive and in red when negative, with rows and columns corresponding to the temporal localization of the moving window along the time series on the left and top, respectively; (3) a time series at the bottom, below this grid, with the highest significant correlations for a given time (vertically, and therefore for the variable that acts as the driver, here the meteorological time series). To read the results, one starts at the diagonal and moves vertically down from it to identify a given lag for which significant correlations are found (the closest to the main diagonal). In some of the SDC figures, the time intervals with high local correlations are highlighted with boxes. These intervals alternate with other ones (left blank) for which no significant correlation is found. All colored areas correspond to significance levels of at least P < 0.05. A new presentation of results is also used for the influenza analysis, in which the two time series are superimposed in the same plot with the significant correlations shown in a panel below as a function of time and lag (Extended Data Fig. 9).
Significance is assessed with a non-parametric randomization test (see ref. 20 for further details and for examples illustrating the method). For the baseline test, SDC calculates Spearman correlations (at α = 0.05) between two white-noise time series at each fragment size s, as a non-parametric permutation test (Supplementary Fig. 7). The indices of the series are randomly re-ordered, breaking their temporal shape. This permutation test enables a first estimate of the probability of finding significant spurious correlations, and it can thus be used as a non-parametric significance test for fragment pairs of any length in the time series of interest. As seen in Supplementary Fig. 7, the threshold decays rapidly as the fragment size grows, with high correlation values becoming rarer as longer fragments are compared. As a second test, SDC evaluates the unwanted effect of internal autocorrelation in a time series. This effect might artificially inflate the correlation obtained between two time series and should therefore be properly taken into consideration when assessing significance. To estimate this effect, we generated pairs of autocorrelated (red) noise series of 365 time steps, with mean μ = 0, standard deviation σ = 1 and varying degrees of autocorrelation (autocorrelation parameter φ in Supplementary Fig. 8), from 0 (equivalent to white noise) to 0.95 in steps of 0.05. We repeated this procedure 20 times for every unique combination of parameters to achieve a robust estimate. The method then searches for significant couplings in either direction. This is carried out for each of these synthetic time-series pairs with an SDC analysis (for example, with s = 30), yielding, for each of them, the false discovery rate, which accounts for type I error and provides the rate of significant couplings at α = 0.05. In Supplementary Fig. 8b, we show the average false discovery rate of the tests for each pair of autocorrelation values. As shown, the chance of finding a spurious coupling increases monotonically as a function of the autocorrelation evaluated.
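A sketch of the red-noise benchmark described above, under the assumption that the surrogates are AR(1) series; the function names and the brute-force scan over all window pairs are illustrative rather than the exact published implementation, and the scan can be slow for 365-point series.

```python
import numpy as np
from scipy.stats import spearmanr

def ar1_series(phi, n=365, rng=None):
    """Generate an AR(1) (red-noise) series, standardized to mu = 0, sigma = 1."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.zeros(n)
    eps = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return (x - x.mean()) / x.std()

def false_discovery_rate(phi_x, phi_y, s=30, alpha=0.05, n_rep=20, rng=None):
    """Fraction of window pairs with spuriously 'significant' Spearman correlations
    between two independent red-noise series with autocorrelations phi_x and phi_y."""
    rng = np.random.default_rng() if rng is None else rng
    hits, total = 0, 0
    for _ in range(n_rep):
        x, y = ar1_series(phi_x, rng=rng), ar1_series(phi_y, rng=rng)
        n = len(x) - s + 1
        for i in range(n):          # all window pairs, as in the SDC grid
            for j in range(n):
                _, p = spearmanr(x[i:i + s], y[j:j + s])
                hits += p < alpha
                total += 1
    return hits / total
```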
Our approach focuses on analyzing temporal associations in one location at a time, and comparing across these locations the patterns of association themselves, including their timing (for example, in the waning or rising phase of epidemic waves). This allows comparison of results among distinct regions, despite differences in control measures and disease epidemic state. Time-series analyses applied to each location do not present the typical problems of comparative spatial regression studies, which might be biased by uncontrollable confounding effects across spatial locations.
One reason why couplings between two variables in ecological or epidemiological systems may be transient couplings is the existence of thresholds above or below which responses to forcing are weak or absent. To interrogate our analyses for the existence of critical thresholds/ranges for optimal transmission of the virus (inferred from COVID-19 disease outcomes), we pooled all negative and positive significant correlations performed at a size s = 21 days between each meteorological variable and COVID-19 cases. We then computed the proportion of those negative correlations among all possible correlations for a given fragment size obtained for each bin of T or AH values and plotted their distribution (Fig. 3 and Supplementary Figs. 4–6) for all individual regions in Italy.
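One way to implement this pooling is sketched below, assuming we have already collected, for each significant window, the mean value of the meteorological variable in that window and the corresponding correlation; the function name and the 1-degree bin width are illustrative assumptions.

```python
import numpy as np

def negative_fraction_by_bin(window_means, corr_values, bin_width=1.0):
    """Proportion of negative correlations among all significant correlations,
    per bin of the meteorological variable (for example, temperature).

    window_means : mean T (or AH) of each window with a significant correlation.
    corr_values  : the corresponding significant Spearman correlations.
    Returns a dict mapping the lower edge of each bin to the negative fraction.
    """
    window_means = np.asarray(window_means, dtype=float)
    corr_values = np.asarray(corr_values, dtype=float)
    edges = np.arange(np.floor(window_means.min()),
                      np.ceil(window_means.max()) + bin_width, bin_width)
    idx = np.digitize(window_means, edges)
    frac = {}
    for b in np.unique(idx):
        in_bin = corr_values[idx == b]
        frac[edges[b - 1]] = np.mean(in_bin < 0) if in_bin.size else np.nan
    return frac
```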
Singular spectrum analysis (SSA) involves the spectral decomposition (eigenvalues and corresponding eigenvectors) of a covariance matrix obtained by lagging the time-series data for a prescribed number of lags M called the embedding dimension. There are two crucial steps in this analysis for which there are no formal results, but useful rules of thumb: (1) the choice of M and (2) the grouping of the eigenvectors to define the specific major components and reconstruct them. Typically, the grouping of the eigencomponents is based on the similarity and magnitudes of the eigenvalues, their power (variance of the data they account for) and the peak frequency of the resulting reconstructed components. For selection of the embedding dimension, one general strategy is to choose it so that at least one period of the lowest-frequency component of interest can be identified, that is, M > fs/fr, where fs is the sampling rate and fr the minimum frequency. Another strategy is that M be large enough that the M-lagged vector incorporates the temporal scale of the time series that is of interest. The larger the M, the more detailed the resulting decomposition of the signal. In particular, the most detailed decomposition is achieved when the embedding dimension is approximately equal to half of the total signal length. A compromise must be reached, however, as a large M implies increased computation, and too large a value may produce mixing of components. SSA is especially well suited for separating components corresponding to different frequencies in nonlinear systems. Here, we applied it to remove the weekly cycle.
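A minimal SSA sketch based on the SVD of the lagged trajectory matrix followed by diagonal averaging; the embedding dimension M and the set of components to keep (for example, all components except those with a roughly 7-day peak frequency, in order to remove the weekly cycle) are analyst choices, and the defaults below are placeholders.

```python
import numpy as np

def ssa_reconstruct(x, M=60, keep=(0, 1)):
    """Basic singular spectrum analysis via SVD of the trajectory matrix.

    x    : 1-D series; M : embedding dimension (number of lags);
    keep : indices of the eigencomponents summed in the reconstruction.
    """
    x = np.asarray(x, dtype=float)
    N = x.size
    K = N - M + 1
    # Trajectory (lagged) matrix: columns are the K length-M windows of x
    X = np.column_stack([x[i:i + M] for i in range(K)])
    U, svals, Vt = np.linalg.svd(X, full_matrices=False)
    # Sum the selected rank-1 component matrices
    Xsum = sum(svals[k] * np.outer(U[:, k], Vt[k]) for k in keep)
    # Diagonal averaging (Hankelization) back to a 1-D series
    recon = np.zeros(N)
    counts = np.zeros(N)
    for i in range(M):
        for j in range(K):
            recon[i + j] += Xsum[i, j]
            counts[i + j] += 1
    return recon / counts
```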
Multiscale SDC (MSDC) analysis
MSDC provides a scan of the SDC analyses over a range of different scales (here, S from 5 to 100 days at 5-day intervals), by selecting the maximum correlation values (positive or negative) closer to the diagonal. The goal is to consider the evolution of transient correlations at all scales pooled together in a single analysis. The MSDC plot displays time on the x axis and scale (S) on the y axis, and positive and negative correlations either jointly or separately. The rationale behind MSDC is that correlations at very small scales can occur by chance because of coincident similar patterns, but that as one moves up to larger scales (by increasing S), the correlation patterns that are spurious tend to vanish, whereas those reflecting mechanistic links increase in strength. This increase in correlation values should occur up to the real scale of interaction, decreasing afterwards. By ‘real’, we mean here the temporal scale covering the extent of the interaction between the driver and the response process (in this case, the response of disease transmission to a given climate factor). Thus, continuity of the same sign correlations together with transitions to larger values are indicative of causal effects, whereas the rapid vanishing of small-scale significant correlations signals spurious ones.
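A sketch of the MSDC scan, reusing the tw_sdc() function from the SDC sketch above; the scale range, the maximum lag searched and the rule for picking the significant value closest to the diagonal are simplified assumptions.

```python
import numpy as np

def msdc(x, y, scales=range(5, 101, 5), max_lag=30, alpha=0.05):
    """Multiscale SDC sketch: for each window size S and each time point,
    keep the first significant correlation found when moving down from the
    diagonal (that is, at the smallest lag), using tw_sdc() defined earlier.
    Returns a dict mapping S to the series of selected correlations.
    """
    result = {}
    for S in scales:
        rho, pval, _ = tw_sdc(x, y, s=S, alpha=alpha)
        n = rho.shape[0]
        best = np.full(n, np.nan)
        for t in range(n):
            for lag in range(min(max_lag, n - 1 - t) + 1):  # move down from the diagonal
                if pval[t + lag, t] < alpha:
                    best[t] = rho[t + lag, t]
                    break
        result[S] = best
    return result
```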
Process-based model
Description
The dynamical model is a discrete stochastic model that incorporates seven different compartments: S, E, I, C, Q, R and D. The model structure is illustrated in Fig. 4. The transition probabilities of the stochastic model are based on the corresponding rates of the transitions between classes in the deterministic (mean-field) model (specified in Fig. 4b). These probabilities are defined as follows. P(e) = (1.0 − exp(−β dt)) is the probability of infection exposure of the susceptible class, where β = (1/N)(βI I + βQ Q) is the infection rate (of the deterministic model). P(i) = (1.0 − exp(−γ dt)) is the probability that a newly exposed individual becomes infectious, where γ denotes the incubation rate. P(r) = (1.0 − exp(−Λ dt)) is the recovery probability, where Λ = λ0(1 − exp(λ1t)) is the (deterministic) recovery rate. P(p) = (1.0 − exp(−α dt)) is the protection probability, where α = α0 exp(α1t). P(d) = (1.0 − exp(−K dt)) is the mortality probability, with K = k0 exp(k1t). P(re) = (1.0 − exp(−τ dt)) is the release probability from confinement, where τ = τ0 exp(τ1t). Finally, P(q) = (1.0 − exp(−δ dt)) is the detection probability, where δ is the quarantine rate (that is, the rate at which infected individuals are detected and isolated from the rest of the population).
In the model, both infected non-detected and infected detected individuals can infect susceptible ones. In the model incorporating temperature in the transmission rate, the respective values of βI and βQ are calculated as follows:
$$\beta_I(t)=\beta_I\,T_{\mathrm{inv}}(t);\quad \beta_Q(t)=\beta_Q\,T_{\mathrm{inv}}(t)$$
where \(T_{\mathrm{inv}}=f\left(\frac{1-T(t)}{\bar{T}}\right)\), with \(\bar{T}\) corresponding to the overall mean of the temperature time series and f(·) to a Savitzky–Golay filter, used to smooth the temperature series with a window size of 50 data points and a polynomial order of 3. When the infection rate is constant, we simply omit the temperature term. For further comparison, in a third model, β is specified with a sinusoidal function of period equal to 12 months and an estimated phase.
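The temperature modulation can be sketched as follows; note that scipy's savgol_filter expects an odd window length, so an assumed window of 51 points is used in place of the 50 points quoted above, and the function name is illustrative.

```python
import numpy as np
from scipy.signal import savgol_filter

def transmission_modulation(temp, window=51, polyorder=3):
    """Temperature-dependent factor T_inv used to scale beta_I and beta_Q.

    temp : daily temperature series T(t). The smoothed quantity is
    (1 - T(t)) / mean(T), following the equation above.
    """
    temp = np.asarray(temp, dtype=float)
    return savgol_filter((1.0 - temp) / temp.mean(), window, polyorder)

# beta_I_t = beta_I * transmission_modulation(temp)
# beta_Q_t = beta_Q * transmission_modulation(temp)
```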
The number of individuals transitioning from compartment i to j at time t is determined by means of binomial distributions P(Xi, P(y)), where Xi corresponds to one of the compartments S, E, I, Q, R, D, C, and P(y) to the respective transition probability defined above. Thus,
e(t) = P(S(t), P(e)), new exposed individuals at time t
p(t) = P(S(t), P(p)), protected individuals at time t
i(t) = P(E(t), P(i)), new infected not detected individuals at time t
q(t) = P(I(t), P(q)), new infected and detected individuals at time t
r(t) = P(Q(t), P(r)), total recovered individuals at time t
d(t) = P(Q(t), P(d)), total dead individuals at time t
re(t) = P(C(t), P(re)), individuals released from confinement at time t
Then, the final dynamics are given by the following equations:
$$S(t)=S(t-\mathrm{d}t)-e(t)-p(t)+re(t)$$
$$E(t)=E(t-\mathrm{d}t)+e(t)-i(t)$$
$$I(t)=I(t-\mathrm{d}t)+i(t)-q(t)$$
$$Q(t)=Q(t-\mathrm{d}t)+q(t)-r(t)-d(t)$$
$$R(t)=R(t-\mathrm{d}t)+r(t)$$
$$D(t)=D(t-\mathrm{d}t)+d(t)$$
$$C(t)=C(t-\mathrm{d}t)+p(t)-re(t)$$
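Putting the transition probabilities and binomial draws together, one stochastic update step might look as follows. The parameter names, the dictionary-based state and the sign conventions of the time-varying rates simply mirror the text above; independent draws from the same compartment are not guarded against overshooting, so this is a sketch rather than the published implementation.

```python
import numpy as np

def step(state, params, temp_factor, t, dt=1.0, rng=None):
    """One stochastic update of the S, E, I, Q, R, D, C compartments.

    state  : dict with the current S, E, I, Q, R, D, C counts (integers).
    params : dict with beta_I, beta_Q, gamma, delta, alpha0, alpha1,
             lambda0, lambda1, k0, k1, tau0, tau1 and population size N.
    temp_factor : T_inv(t) when temperature modulates transmission, else 1.0.
    """
    rng = np.random.default_rng() if rng is None else rng
    pm = params
    S, E, I, Q, R, D, C = (state[k] for k in "SEIQRDC")

    # Time-varying (deterministic) rates, with signs as written in the text;
    # fitted exponents may be negative so that the rates stay non-negative.
    beta = (pm["beta_I"] * temp_factor * I + pm["beta_Q"] * temp_factor * Q) / pm["N"]
    alpha = pm["alpha0"] * np.exp(pm["alpha1"] * t)           # protection rate
    lam = pm["lambda0"] * (1.0 - np.exp(pm["lambda1"] * t))   # recovery rate
    kappa = pm["k0"] * np.exp(pm["k1"] * t)                   # mortality rate
    tau = pm["tau0"] * np.exp(pm["tau1"] * t)                 # release rate

    prob = lambda rate: 1.0 - np.exp(-rate * dt)              # rate -> probability

    e = rng.binomial(S, prob(beta))            # new exposed
    p_new = rng.binomial(S, prob(alpha))       # newly protected (confined)
    i = rng.binomial(E, prob(pm["gamma"]))     # new infectious, not detected
    q = rng.binomial(I, prob(pm["delta"]))     # new detected (quarantined)
    r = rng.binomial(Q, prob(lam))             # recoveries
    d = rng.binomial(Q, prob(kappa))           # deaths
    re = rng.binomial(C, prob(tau))            # released from confinement

    return {"S": S - e - p_new + re, "E": E + e - i, "I": I + i - q,
            "Q": Q + q - r - d, "R": R + r, "D": D + d, "C": C + p_new - re}
```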
Calibration
The model was implemented using Python and calibrated by means of the least squares algorithm of the scipy library. The error function minimized with this algorithm was obtained from the normalized residuals on the basis of total cases (Q + R + D) and deaths (D).
To search parameter space, we ran 100 calibrations starting from different initial choices of parameter combinations. The tolerance for termination in the change of the cost function was set to 1 × 10⁻¹⁰. Tolerance for termination by the norm of the gradient was also set to 1 × 10⁻¹⁰, and the tolerance for termination by the change of the independent variables was set to 1 × 10⁻¹⁰. The solver was the lsmr method (which is suitable for problems with sparse and large Jacobian matrices) with a differential step of 1 × 10⁻⁵. With this configuration, each fitting run usually converged after ~500 iterations.
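An illustrative calibration wrapper using scipy.optimize.least_squares with the solver settings described above; the simulate function, the observation arrays and the initial guesses are placeholders to be supplied by the user, and the residual normalization is only a plausible reading of the description above.

```python
import numpy as np
from scipy.optimize import least_squares

def calibrate(simulate, obs_cases, obs_deaths, initial_guesses):
    """Fit model parameters by least squares from multiple starting points.

    simulate(theta, n_steps) is expected to return predicted Q, R, D series;
    obs_cases / obs_deaths are the observed total cases and deaths.
    """
    def residuals(theta):
        Q, R, D = simulate(theta, len(obs_cases))
        res_cases = (Q + R + D - obs_cases) / obs_cases.max()   # normalized residuals
        res_deaths = (D - obs_deaths) / obs_deaths.max()
        return np.concatenate([res_cases, res_deaths])

    fits = []
    for theta0 in initial_guesses:               # e.g. 100 random parameter sets
        fits.append(least_squares(
            residuals, theta0,
            method="trf", tr_solver="lsmr",      # lsmr: large/sparse Jacobians
            ftol=1e-10, xtol=1e-10, gtol=1e-10,  # termination tolerances
            diff_step=1e-5,                      # finite-difference step
        ))
    return min(fits, key=lambda f: f.cost)       # best of the multi-start runs
```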
Validation
To compare the model including an effect of T in the transmission rate to those without it, we calculated the chi-square, Akaike information criterion (AIC) and Bayesian information criterion (BIC) indices for the residuals obtained from the optimization process. The resulting values are shown in Supplementary Table 1.
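For reference, one common way to compute these indices from a residual vector under a Gaussian error assumption is sketched below; the exact formulas behind Supplementary Table 1 are not reproduced here, so this is only a plausible illustration.

```python
import numpy as np

def information_criteria(residuals, n_params):
    """Chi-square (residual sum of squares), AIC and BIC from a residual vector,
    assuming Gaussian errors and n_params fitted parameters."""
    residuals = np.asarray(residuals, dtype=float)
    n = residuals.size
    rss = np.sum(residuals ** 2)              # chi-square / residual sum of squares
    aic = n * np.log(rss / n) + 2 * n_params
    bic = n * np.log(rss / n) + n_params * np.log(n)
    return rss, aic, bic
```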
Our choice of T, rather than AH, to modulate the infection rate (β) follows from the fact that the temporal dynamics of the two climate factors roughly follow the same shape, with the advantage that T shows less oscillatory behavior than AH. This less fluctuating pattern adds stability to the model fitting when the inverse relationship is used in the calculation of β (Supplementary Information). This selection is further supported by the results of the SDC analyses, which yielded larger correlations for temperature than for AH, even when accounting for its larger autocorrelation structure.