Generation time
During the infectious period, an infected individual may produce secondary infections. However, the individual's infectiousness is not constant over that period; it can be approximated by the probability distribution of the generation time (GT), defined as the time between the infection of a primary case and the infection of a secondary case. Unfortunately, this distribution is harder to estimate than that of the serial interval, defined as the time between symptom onset in a primary case and symptom onset in a secondary case, because the time of infection is more difficult to detect than the time of symptom onset. Ganyani et al.27 developed a methodology to estimate the distribution of the GT from the distributions of the incubation period and the serial interval. Assuming an incubation period following a gamma distribution with a mean of 5.2 days and a standard deviation (SD) of 2.8 days, they estimated the serial interval from 91 and 135 documented infector-infectee pairs in Singapore and Tianjin (China), respectively. They found that the GT followed a gamma distribution with mean = 5.20 (95% CI = [3.78, 6.78]) days and SD = 1.72 (95% CI = [0.91, 3.93]) days for Singapore (hereafter GT1), and with mean = 3.95 (95% CI = [3.01, 4.91]) days and SD = 1.51 (95% CI = [0.74, 2.97]) days for Tianjin (hereafter GT2). Ng et al.28 applied the same methodology to 209 infector-infectee pairs in Singapore and determined a gamma distribution with mean = 3.44 (95% CI = [2.79, 4.11]) days and SD = 2.39 (95% CI = [1.27, 3.45]) days (hereafter GT3). Figure 3 shows the probability density functions (PDF) of these distributions, fGT. The distributions differ markedly. For example, 54.5%, 81.0%, and 80.7% of contagions occur in a pre-symptomatic stage (within the first 5.2 days after primary infection, the assumed mean incubation period) under GT1, GT2, and GT3, respectively.
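The pre-symptomatic fractions quoted above can be approximated as the cumulative probability of each gamma distribution at 5.2 days. The following is a minimal sketch, not the authors' code: the helper names (`gamma_params`, `gamma_pdf`, `gamma_cdf`) are hypothetical, the CDF is obtained by Simpson integration of the PDF rather than by a statistics library, and the PDF is set to zero at t = 0, which assumes a shape parameter larger than 1 (true for GT1, GT2, and GT3).

```python
import math

def gamma_params(mean, sd):
    """Convert a mean and SD into the shape and scale of a gamma distribution."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale

def gamma_pdf(t, shape, scale):
    """Gamma density, evaluated in log space for numerical stability."""
    if t <= 0.0:
        return 0.0  # assumption: shape > 1, so the density vanishes at t = 0
    log_pdf = ((shape - 1.0) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def gamma_cdf(x, shape, scale, steps=2000):
    """Composite Simpson approximation of the integral of the PDF over [0, x]."""
    h = x / steps  # steps must be even
    total = gamma_pdf(0.0, shape, scale) + gamma_pdf(x, shape, scale)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * gamma_pdf(i * h, shape, scale)
    return total * h / 3.0

# (mean, SD) in days, as reported in the text for GT1, GT2, and GT3.
GT = {"GT1": (5.20, 1.72), "GT2": (3.95, 1.51), "GT3": (3.44, 2.39)}
INCUBATION_MEAN = 5.2  # days, mean incubation period assumed in the text

presymptomatic = {
    name: gamma_cdf(INCUBATION_MEAN, *gamma_params(mean, sd))
    for name, (mean, sd) in GT.items()
}
for name, fraction in presymptomatic.items():
    print(f"{name}: {100 * fraction:.1f}% of transmission before day {INCUBATION_MEAN}")
```

Because the reported means and SD are rounded, the printed percentages should agree with the quoted values only to within a few tenths of a percentage point.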
Figure 3. Probability density function of the generation time distribution, fGT(t), of GT1 (blue line; Singapore27), GT2 (yellow line; Tianjin27), GT3 (red line; Singapore28), and GTth (black line; theoretical distribution). Bars are the discretized version, \(\widetilde{f_{GT}}(n)\), of the PDF of GTth.
Theoretically, assuming that the incubation periods of two individuals are independent and identically distributed, which is quite plausible, the expected values of the GT and the serial interval should be equal29,30. The mean of the serial interval is easier to estimate than that of the GT. For that reason, we assume that the mean serial interval estimated from a meta-analysis of 13 studies involving a total of 964 infector-infectee pairs, 4.99 days (95% CI = [4.17, 5.82])31, is more reliable than the aforementioned means of the GT. This value is within the error estimates of the means of GT1 and GT2, but not of GT3. We therefore construct a theoretical gamma distribution for the GT (hereafter GTth) with mean = 4.99 days and SD = 1.88 days. This theoretical distribution is shown in Fig. 3 and approximates the average PDF of three gamma distributions with mean = 4.99 days and the SD of GT1, GT2, and GT3, respectively. We assume a conservative CI = [1.51, 2.39] days for the theoretical SD, bounded by the minimum and maximum SD values of GT1, GT2, and GT3. GTth yields 63.1% of pre-symptomatic contagions.
R0 from r
In theory, the basic reproduction number R0 can be estimated provided that the intrinsic growth rate r and the distributions of both the latent and infectious periods are known26,32,33,34. The latent period is the period during which an infected individual cannot yet infect other individuals. It is observed in diseases for which the infectious period starts around the end of the incubation period, as is the case for influenza35 and SARS36. However, Fig. 3 suggests that COVID-19 is transmissible from the moment of infection, so we assume a null latent period. Then, if the GT follows a gamma distribution, R0 can be estimated from the formulation of Anderson and Watson32, which was adapted to null latent periods by Yan26 as
$$ R_{0} = \frac{mean_{GT}}{1 - \left( 1 + mean_{GT} \cdot r \cdot \frac{1}{shape_{GT}} \right)^{-shape_{GT}}} \cdot r, $$
(4)
where meanGT is the mean GT and shapeGT is one of the two parameters defining the gamma distribution, which can be estimated as
$$ shape_{GT} = \frac{\left( mean_{GT} \right)^{2}}{\left( SD_{GT} \right)^{2}}. $$
(5)
For GTth, we get R0 = 1.50 (CI = [1.41, 1.61]) for REMEDID I(n) and R0 = 1.76 (CI = [1.60, 1.94]) for official I(n). For the other three GT distributions, R0 ranges from 1.39 (CI = [1.27, 1.58]) to 1.51 (CI = [1.34, 1.80]) for REMEDID I(n) and from 1.59 (CI = [1.40, 1.88]) to 1.78 (CI = [1.51, 2.23]) for official I(n) (Table 1). In all cases, the R0 values from GTth lie within the range of those from the three known GT distributions and are indistinguishable from them within the error estimates. The lower (upper) bound of the CI is estimated as the minimum (maximum) R0 obtained from all possible combinations of 100 evenly spaced values covering the CI of each of r, meanGT, and SDGT. Then, following the Bonferroni correction, the reported CI have a confidence level of at least 85% for GT1, GT2, and GT3 (combining three 95% CI gives 1 − 3 × 0.05 = 0.85), but this cannot be guaranteed for GTth since the CI of its SD is unknown. In general, all these R0 estimates are lower than those summarised by Park et al.20.
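As an illustration of Eqs. (4) and (5) and of the grid-based bounds described above, the sketch below evaluates R0 for GTth and scans a parameter grid for the minimum and maximum. It is a sketch under stated assumptions, not the authors' implementation: the growth rate r = 0.159 per day and its interval [0.14, 0.18] are hypothetical values that do not appear in this section (r = 0.159 was picked only because it lands near the reported R0 for GTth and REMEDID), the function names are invented for this example, and the grid uses 30 points per parameter instead of 100 to keep the run short.

```python
def r0_from_growth_rate(r, mean_gt, sd_gt):
    """Eq. (4), with the gamma shape parameter taken from Eq. (5)."""
    if r == 0.0:
        return 1.0  # limiting value of Eq. (4) as r tends to 0
    shape = (mean_gt / sd_gt) ** 2
    return mean_gt * r / (1.0 - (1.0 + mean_gt * r / shape) ** (-shape))

def linspace(lo, hi, n):
    """n evenly spaced values from lo to hi, both included."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

def r0_bounds(r_ci, mean_ci, sd_ci, points=100):
    """Minimum and maximum R0 over every combination of grid values."""
    lowest, highest = float("inf"), float("-inf")
    for r in linspace(*r_ci, points):
        for mean_gt in linspace(*mean_ci, points):
            for sd_gt in linspace(*sd_ci, points):
                value = r0_from_growth_rate(r, mean_gt, sd_gt)
                lowest = min(lowest, value)
                highest = max(highest, value)
    return lowest, highest

# GTth mean and SD with their CI are taken from the text; r and its CI are hypothetical.
point = r0_from_growth_rate(0.159, 4.99, 1.88)
lower, upper = r0_bounds((0.14, 0.18), (4.17, 5.82), (1.51, 2.39), points=30)

# Bonferroni bound on the joint confidence of three 95% intervals.
joint_confidence = 1.0 - 3 * 0.05

print(f"R0 = {point:.2f}, grid bounds = [{lower:.2f}, {upper:.2f}]")
print(f"joint confidence level of at least {joint_confidence:.0%}")
```

Taking the extremes over the whole grid, rather than only over the corner combinations, avoids having to assume that R0 is monotonic in each parameter.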
Alternatively, R0 can be estimated by applying the Euler–Lotka equation29,33,
$$ R_{0} = \frac{1}{\int_{0}^{+\infty} e^{-rt} \cdot f_{GT}\left( t \right) dt}. $$
(6)
In this case, we get values closer to previous estimates20. In particular, for GTth, we get R0 = 2.12 (CI = [1.81, 2.48]) for REMEDID I(n) and R0 = 2.92 (CI = [2.28, 3.75]) for official I(n). For the other three GT distributions, R0 ranges from 1.63 (CI = [1.43, 1.90]) to 2.21 (CI = [1.59, 2.95]) for REMEDID I(n) and from 1.97 (CI = [1.59, 2.54]) to 3.11 (CI = [1.84, 4.90]) for official I(n) (Table 1). The CI are estimated as in Eq. (4).
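For a gamma-distributed GT, the integral in Eq. (6) has a known closed form, (1 + r · scale)^(−shape), so Eq. (6) reduces to R0 = (1 + r · scale)^shape with scale = SD²/mean. The sketch below checks a numerical evaluation of Eq. (6) against that closed form. It is only an illustration: the function names are hypothetical, r = 0.159 per day is an assumed value that is not given in this section, and the integral is truncated at 60 days, where the gamma tail is negligible for the distributions considered here.

```python
import math

def gamma_pdf(t, shape, scale):
    """Gamma density; returns 0 at t = 0, which assumes shape > 1."""
    if t <= 0.0:
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def euler_lotka_numeric(r, mean_gt, sd_gt, upper=60.0, steps=6000):
    """Eq. (6) with the integral approximated by composite Simpson on [0, upper]."""
    shape, scale = (mean_gt / sd_gt) ** 2, sd_gt ** 2 / mean_gt

    def integrand(t):
        return math.exp(-r * t) * gamma_pdf(t, shape, scale)

    h = upper / steps  # steps must be even
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return 1.0 / (total * h / 3.0)

def euler_lotka_closed_form(r, mean_gt, sd_gt):
    """Eq. (6) for a gamma GT, using its Laplace transform."""
    shape, scale = (mean_gt / sd_gt) ** 2, sd_gt ** 2 / mean_gt
    return (1.0 + r * scale) ** shape

R_HYPOTHETICAL = 0.159  # per day; assumed for illustration only
numeric = euler_lotka_numeric(R_HYPOTHETICAL, 4.99, 1.88)
closed = euler_lotka_closed_form(R_HYPOTHETICAL, 4.99, 1.88)
print(f"numeric R0 = {numeric:.3f}, closed-form R0 = {closed:.3f}")
```

With the same assumed r, this estimate is larger than the one from Eq. (4), in line with the ordering of the values reported for the two methods.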
R0 from a dynamical model
We designed a dynamic model with Susceptible-Infected-Recovered (SIR) stocks that accounts for the infectiousness of the infectors. Such a model is a generalisation of the Susceptible-Exposed-Infected-Recovered (SEIR) model37. Births, deaths, immigration, and emigration are ignored, which seems reasonable since the timescale of the outbreak is too short to produce significant demographic changes. For the sake of simplicity, the recovered stock includes recoveries and fatalities, and it is denoted as R(t). A random mixing population is assumed, that is, a population where contacts between any two people are equally probable. Time is discretized in days, so the real time variable t is replaced by the integer variable n. As a consequence, the derivatives in the differential equations defining the dynamic model explained below are discrete derivatives.
The size of the population is fixed at N = 100,000, and then, for any day n we get
$$ \tilde{S}\left( n \right) + \left( \sum\limits_{k = 0}^{20} \tilde{I}\left( n - k \right) \right) + \tilde{R}\left( n \right) = N, $$
(7)
where \(\tilde{S}\left( n \right)\), \(\tilde{I}\left( n \right)\), and \(\tilde{R}\left( n \right)\) are the discretized versions of S(t), I(t), and R(t), and \(\tilde{I}\) is assumed to be null for negative integers. The summation is a consequence of the infectiousness, which is approximated according to the GT, whose PDF is discretized as
$$ \widetilde{f_{GT}}\left( n \right) = \int_{n - 1}^{n} f_{GT}\left( t \right) dt, $$
(8)
from n = 1 to 20. Figure 3 shows \(\widetilde{f_{GT}}\left( n \right)\) for GTth. Truncating at n = 20 accounts for 99.99% of the area below the PDF of all the GT. Then, an infected individual at day n0 is expected to produce on average
$$ \widetilde{R_{e}}\left( n_{0} + n \right) \cdot \widetilde{f_{GT}}\left( n \right) $$
(9)
infections n days later, where \(\widetilde{R_{e}}\left( n \right)\) is the discretized version of Re(t). From this expression, it is obvious that values of \(\widetilde{R_{e}}\left( n \right) < 1\) will produce a decline of infections. Conversely, infections at day n0 are produced by all individuals infected during the previous 20 days as
$$ \tilde{I}\left( n_{0} \right) = \widetilde{R_{e}}\left( n_{0} \right) \cdot \left( \sum\limits_{n = 1}^{20} \tilde{I}\left( n_{0} - n \right) \cdot \widetilde{f_{GT}}\left( n \right) \right), $$
(10)
whose continuous version has been reported in previous studies29,38. The expression in brackets is called the total infectiousness39 of infected individuals at day n0. According to Eq. (1), Eq. (10) can be expressed in terms of R0 as
$$ \tilde{I}\left( n_{0} \right) = R_{0} \cdot \frac{\tilde{S}\left( n_{0} \right)}{N} \cdot \left( \sum\limits_{n = 1}^{20} \tilde{I}\left( n_{0} - n \right) \cdot \widetilde{f_{GT}}\left( n \right) \right). $$
(11)
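The discretization in Eq. (8) and the renewal relation in Eq. (11) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, each daily integral is approximated with Simpson's rule instead of an exact gamma CDF, and the infection history passed to `expected_infections` is made-up data used only to exercise the formula.

```python
import math

def gamma_pdf(t, shape, scale):
    """Gamma density; returns 0 at t = 0, which assumes shape > 1."""
    if t <= 0.0:
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def discretize_gt(mean_gt, sd_gt, n_max=20, steps=200):
    """Eq. (8): integrate the PDF over [n - 1, n] for n = 1..n_max."""
    shape, scale = (mean_gt / sd_gt) ** 2, sd_gt ** 2 / mean_gt
    h = 1.0 / steps  # steps must be even
    weights = []
    for n in range(1, n_max + 1):
        a = n - 1.0
        total = gamma_pdf(a, shape, scale) + gamma_pdf(a + 1.0, shape, scale)
        for i in range(1, steps):
            total += (4.0 if i % 2 else 2.0) * gamma_pdf(a + i * h, shape, scale)
        weights.append(total * h / 3.0)
    return weights

def expected_infections(r0, susceptible, population, past_infections, f_gt):
    """Eq. (11): past_infections[-n] holds the infections n days before n0."""
    infectiousness = sum(
        past_infections[-n] * f_gt[n - 1]
        for n in range(1, min(len(f_gt), len(past_infections)) + 1)
    )
    return r0 * susceptible / population * infectiousness

f_gt = discretize_gt(4.99, 1.88)   # GTth, as defined in the text
coverage = sum(f_gt)               # area of the PDF captured by 20 days
peak_lag = f_gt.index(max(f_gt)) + 1

# Hypothetical history: one infection per day over the previous 20 days.
history = [1.0] * 20
today = expected_infections(2.71, 100_000, 100_000, history, f_gt)
print(f"coverage = {coverage:.4f}, most infectious lag = day {peak_lag}")
print(f"expected infections today = {today:.3f}")
```

With a flat history and a fully susceptible population, the expected number of infections equals R0 times the captured area, which makes the role of the total infectiousness term easy to verify by hand.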
As we want a dynamic model capable of providing \(\tilde{I}\left( n_{0} \right)\) from the stocks at time step n0 − 1, we replaced \(\tilde{S}\left( n_{0} \right)\) by \(\tilde{S}\left( n_{0} - 1 \right)\) in Eq. (11). This assumption makes sense in a discrete domain since the infections at time n0 take place in the susceptible population at time n0 − 1. Then, assuming that all stocks are set to zero for negative integers, our dynamic model can be expressed in terms of Eq. (7) and the following differential equations:
$$ \delta \tilde{I}\left( n_{0} \right) = R_{0} \cdot \frac{\tilde{S}\left( n_{0} - 1 \right)}{N} \cdot \left( \sum\limits_{n = 1}^{20} \tilde{I}\left( n_{0} - n \right) \cdot \widetilde{f_{GT}}\left( n \right) \right) - \tilde{I}\left( n_{0} - 1 \right), $$
(12)
$$ \delta \tilde{S}\left( n_{0} \right) = -\tilde{I}\left( n_{0} \right), $$
(13)
$$ \delta \tilde{R}\left( n_{0} \right) = \tilde{I}\left( n_{0} - 21 \right), $$
(14)
where \(\delta \tilde{I}\), \(\delta \tilde{S}\), and \(\delta \tilde{R}\) are the (discrete) derivatives of \(\tilde{I}\), \(\tilde{S}\), and \(\tilde{R}\), respectively. Applying the initial conditions \(\tilde{S}\left( 0 \right) = N - 1\), \(\tilde{I}\left( 0 \right) = 1\), and \(\tilde{R}\left( 0 \right) = 0\), it is assumed that the outbreak was produced by only one infector. This is not true in Spain, since several independent introductions of SARS-CoV-2 were detected40. However, for modelling purposes, introducing a single infection at day 0 is equivalent to introducing, n days later, the M infections produced by that single infection. The date corresponding to the initial time n = 0 is therefore treated as a parameter, date0, which is optimised together with R0 to minimise the root-mean square of the residual between the model-simulated \(\tilde{I}\left( n \right)\) and the REMEDID and official I(n) for the period from date0 to n0.
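A minimal simulation of the model defined by Eqs. (7) and (12) to (14) is sketched below. It is not the Stella or R implementation used in the study. It assumes that the discrete derivative is the backward difference, δX(n0) = X(n0) − X(n0 − 1), which is the reading under which Eq. (7) is conserved exactly. The function names are hypothetical, and R0 = 2.71 with GTth is used only as an example input.

```python
import math

def gamma_pdf(t, shape, scale):
    """Gamma density; returns 0 at t = 0, which assumes shape > 1."""
    if t <= 0.0:
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def discretize_gt(mean_gt, sd_gt, n_max=20, steps=200):
    """Eq. (8): Simpson integral of the PDF over [n - 1, n] for n = 1..n_max."""
    shape, scale = (mean_gt / sd_gt) ** 2, sd_gt ** 2 / mean_gt
    h = 1.0 / steps
    weights = []
    for n in range(1, n_max + 1):
        a = n - 1.0
        total = gamma_pdf(a, shape, scale) + gamma_pdf(a + 1.0, shape, scale)
        for i in range(1, steps):
            total += (4.0 if i % 2 else 2.0) * gamma_pdf(a + i * h, shape, scale)
        weights.append(total * h / 3.0)
    return weights

def simulate(r0, f_gt, n_days, population=100_000):
    """Iterate Eqs. (12)-(14) from S(0) = N - 1, I(0) = 1, R(0) = 0."""
    window = len(f_gt)
    s, i, r = [population - 1.0], [1.0], [0.0]

    def infections(day):
        return i[day] if day >= 0 else 0.0  # stocks are zero before day 0

    for n0 in range(1, n_days + 1):
        infectiousness = sum(infections(n0 - n) * f_gt[n - 1]
                             for n in range(1, window + 1))
        new_infections = r0 * s[n0 - 1] / population * infectiousness  # Eq. (12)
        i.append(new_infections)
        s.append(s[n0 - 1] - new_infections)                           # Eq. (13)
        r.append(r[n0 - 1] + infections(n0 - window - 1))              # Eq. (14)
    return s, i, r

N = 100_000
f_gt = discretize_gt(4.99, 1.88)  # GTth
susceptible, infected, recovered = simulate(2.71, f_gt, n_days=200, population=N)

def stock_total(n):
    """Left-hand side of Eq. (7) at day n."""
    active = sum(infected[n - k] for k in range(21) if n - k >= 0)
    return susceptible[n] + active + recovered[n]

peak_day = max(range(len(infected)), key=infected.__getitem__)
print(f"peak of daily infections on day {peak_day}")
print(f"cumulative infections after 200 days: {sum(infected):.0f}")
```

Evaluating `stock_total(n)` for any simulated day should return N up to floating-point error, which is a convenient check that the three update rules are mutually consistent.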
The model was implemented in Stella Architect software v2.1.1 (www.iseesystems.com) and exported to R software v4.1.1 with the help of deSolve (v1.28) and stats (v4.1.1) packages, and the Brent optimisation algorithm was implemented. For REMEDID I(n) and GTth, we obtained date0 = 13 December 2019 and R0 = 2.71 (CI = [2.33, 3.15]). Optimal solutions combine lower/higher R0 and earlier/later date0 (Fig. 4), which highlights the importance of providing an accurate first infection date to estimate R0. When the other three GT distributions were considered, we obtained similar date0, ranging from 12 to 17 December 2019, and R0 values ranging from 2.08 (CI = [1.86, 2.42]) to 2.85 (CI = [2.05, 3.25]; see Table 1). For official infections, date0 was set to 1 January 2020 for all cases, and R0 ranged from 1.81 (CI = [1.64, 2.07]) to 2.41 (CI = [1.80, 2.91]). The CI are estimated as in Eq. (4).
Figure 4. Root-mean square (RMS) of the residuals between infections from the model, which depends on date0 (x-axis) and R0 (y-axis), and REMEDID (from MoMo ED) and official infections. Parameters optimizing the model are highlighted in purple. RMS larger than 1275 (left panel) and 103 (right panel) are saturated in white.
Source: Ecology - nature.com