Reliably quantifying the evolving worldwide dynamic state of the COVID-19 outbreak from death records, clinical parametrization, and demographic data

Infection-age structured dynamics

For the description of the dynamics, we follow the customary infection-age structured approach (for details see, for instance, Refs. 4, 10, 11, 12). Explicitly, we consider the infection-age structured dynamics of the number of individuals \(u_I(t,\tau)\) at time \(t\) who were infected at time \(t-\tau\), given by

$$\frac{\partial}{\partial t}u_I(t,\tau)+\frac{\partial}{\partial \tau}u_I(t,\tau)=0$$

(7)

with boundary condition

$$u_I(t,0)=j(t).$$

(8)

Here, \(\tau\) is the time elapsed after infection, referred to as infection age, and \(j(t)=\int_0^\infty k_I(t,\tau)\,u_I(t,\tau)\,d\tau\) is the incidence, with \(k_I(t,\tau)\) being the rate of secondary transmissions per single primary case.

The solution is obtained through the method of characteristics (Ref. 32) as

$$u_I(t,\tau)=j(t-\tau)$$

(9)

for \(t\ge\tau\) and \(u_I(t,\tau)=0\) for \(t<\tau\). The resulting renewal equation, \(j(t)=\int_0^\infty k_I(t,\tau)\,j(t-\tau)\,d\tau\), is used as the basis for the definitions of the reproduction number \(R_t=\int_0^\infty k_I(t,\tau)\,d\tau\) and the probability density of the generation time \(f_{GT}(\tau)=k_I(t,\tau)/R_t\).

The infectious population is given by

$$n_I(t)=\int_0^\infty P_I(\tau)\,u_I(t,\tau)\,d\tau,$$

(10)

which considers that an individual remains potentially infectious after a time \(\tau\) from infection with probability

$$P_I(\tau)=\int_\tau^\infty f_{GT}(l)\,dl.$$

(11)

Therefore, in terms of the incidence [substituting Eq. (9) in Eq. (10)], we have

$$n_I(t)=\int_0^\infty P_I(\tau)\,j(t-\tau)\,d\tau.$$

(12)

Additionally, we consider the expected cumulative number of infections, \(n_T(t)\), expressed in terms of the overall accumulated incidence as

$$n_T(t)=\int_0^t j(s)\,ds,$$

(13)

and the dynamics of the expected cumulative deaths, \(n_D(t)\),

$$\frac{d}{dt}n_D(t)=IFR\int_0^t f_{OD}(t-l)\int_0^l f_I(l-s)\,j(s)\,ds\,dl,$$

(14)

which takes into account that deaths occur with probability given by the infection fatality rate, \(IFR\), at times after infection given by the convolution of the probability density functions of the incubation, \(f_I\), and symptom onset-to-death, \(f_{OD}\), times.

Similarly, the variation of the expected number of seropositive individuals at a time \(t\), \(n_{SP}(t)\), is expressed as

$$\frac{d}{dt}n_{SP}(t)=\int_0^t f_{SP}(t-s)\,j(s)\,ds,$$

(15)

where \(f_{SP}\) is the probability density function of the seroconversion time after infection, and the expected number of individuals with positive RT-PCR testing, \(n_{TP}(t)\), as

$$n_{TP}(t)=\int_0^\infty P_{TP}(\tau)\,u_I(t,\tau)\,d\tau,$$

(16)

where \(P_{TP}(\tau)\) is the probability that an infected individual would test positive at a time \(\tau\) after infection.

Dynamical constraints

To obtain a closed set of equations for the different epidemiological quantities, we developed an approach to optimally simplify the convolutions. Explicitly, for the expressions involving an integral \(\int_0^\infty A(\tau)\,j(t-\tau)\,d\tau\) of a function \(A\) with the incidence \(j\), we perform a series expansion of the incidence around the infection-age time \(\tau_A\),

$$j(t-\tau)=j(t-\tau_A)+j'(t-\tau_A)\,(\tau_A-\tau)+O(j''),$$

(17)

with the value of \(\tau_A\) chosen as

$$\tau_A=\frac{\int_0^\infty \tau A(\tau)\,d\tau}{\int_0^\infty A(\tau)\,d\tau}.$$

(18)

The specific value of \(\tau_A\) leads directly to a first-order approximation,

$$\int_0^\infty A(\tau)\,j(t-\tau)\,d\tau=j(t-\tau_A)\int_0^\infty A(\tau)\,d\tau+O(j''),$$

(19)

because \(\int_0^\infty A(\tau)\,(\tau_A-\tau)\,d\tau=0\) by the definition of \(\tau_A\).
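The first-order approximation of Eq. (19) can be checked numerically. The following is a minimal sketch, not the authors' implementation: the gamma-shaped kernel, its mean and standard deviation, the two incidence curves, and the midpoint integration grid are all illustrative assumptions.

```python
import math

def gamma_pdf(tau, mean, sd):
    """Gamma density parametrized by its mean and standard deviation."""
    alpha = (mean / sd) ** 2
    beta = mean / sd ** 2
    if tau <= 0:
        return 0.0
    return math.exp(alpha * math.log(beta) - math.lgamma(alpha)
                    + (alpha - 1) * math.log(tau) - beta * tau)

def convolution_vs_first_order(j, A, t, tau_max=60.0, d_tau=0.01):
    """Return (numerical integral, first-order approximation) for Eq. (19)."""
    # Midpoint rule on a truncated grid (assumption: A is negligible beyond tau_max).
    taus = [(i + 0.5) * d_tau for i in range(int(tau_max / d_tau))]
    area = sum(A(tau) for tau in taus) * d_tau
    tau_A = sum(tau * A(tau) for tau in taus) * d_tau / area   # Eq. (18)
    exact = sum(A(tau) * j(t - tau) for tau in taus) * d_tau
    return exact, j(t - tau_A) * area

# Hypothetical kernel and incidence curves, chosen only for illustration.
A = lambda tau: gamma_pdf(tau, mean=5.0, sd=2.0)
linear = lambda t: 100.0 + 3.0 * t
growing = lambda t: 100.0 * math.exp(0.05 * t)

exact_lin, approx_lin = convolution_vs_first_order(linear, A, t=100.0)
exact_exp, approx_exp = convolution_vs_first_order(growing, A, t=100.0)
```

For the linear incidence the two values agree to rounding error, since the second derivative vanishes; for the exponentially growing incidence the discrepancy stays below one percent with these illustrative parameters.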

Using this approach, we obtain

$$n_I(t)=j\left(t-\frac{\tau_G^2+\sigma_G^2}{2\tau_G}\right)\tau_G+O(j'')$$

(20)

from Eq. (12), where \(\tau_G=\int_0^\infty \tau f_{GT}(\tau)\,d\tau\) and \(\sigma_G^2=\int_0^\infty (\tau-\tau_G)^2 f_{GT}(\tau)\,d\tau\) are the average and variance of the generation time, respectively, and

$$\frac{d}{dt}n_D(t)=IFR\,j(t-\tau_I-\tau_{OD})+O(j'')$$

(21)

from Eq. (14), where \(\tau_I=\int_0^\infty \tau f_I(\tau)\,d\tau\) and \(\tau_{OD}=\int_0^\infty \tau f_{OD}(\tau)\,d\tau\) are the incubation and symptom onset-to-death average times, respectively. These expressions lead straightforwardly to

$$\frac{d}{dt}n_D(t)=\frac{IFR}{\tau_G}\,n_I\left(t+\frac{\tau_G^2+\sigma_G^2}{2\tau_G}-\tau_I-\tau_{OD}\right)$$

(22)

and

$$n_D(t)=IFR\,n_T(t-\tau_I-\tau_{OD}),$$

(23)

up to \(\mathcal{O}(j'')\). Note that we have used \(\int_0^\infty 2\tau P_I(\tau)\,d\tau=\int_0^\infty d\left(\tau^2 P_I(\tau)\right)-\int_0^\infty \tau^2\,dP_I(\tau)=\int_0^\infty \tau^2 f_{GT}(\tau)\,d\tau\) and \(\int_0^\infty P_I(\tau)\,d\tau=\int_0^\infty d\left(\tau P_I(\tau)\right)-\int_0^\infty \tau\,dP_I(\tau)=\int_0^\infty \tau f_{GT}(\tau)\,d\tau\).

These expressions are used to estimate the infectious population \(n_I(t)\) from the daily deaths, \(\frac{d}{dt}n_D\), at time \(t+\tau_I+\tau_{OD}-\frac{\tau_G^2+\sigma_G^2}{2\tau_G}\) and the cumulative infected population \(n_T(t)\) from the cumulative deaths, \(n_D\), at time \(t+\tau_I+\tau_{OD}\), leading to

$$n_I(t)=\frac{\tau_G}{IFR}\,\frac{d}{dt}n_D\left(t+\tau_I+\tau_{OD}-\frac{\tau_G^2+\sigma_G^2}{2\tau_G}\right),$$

(24)

$$n_T(t)=\frac{1}{IFR}\,n_D(t+\tau_I+\tau_{OD}).$$

(25)
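Equations (24) and (25) amount to rescaling and time-shifting the death series. The sketch below illustrates this under stated assumptions: the function name, the list-based daily indexing, the rounding of delays to whole days, and every parameter value in the usage lines are hypothetical and are not taken from the cited references.

```python
def infer_populations(n_D, ifr, tau_G, sigma_G, tau_I, tau_OD):
    """Apply Eqs. (24) and (25) to a daily series of expected cumulative deaths.

    n_D[t] is the expected cumulative death count on day t. Returns (n_I, n_T),
    each covering only the days for which the shifted death data exist.
    """
    tau_A = (tau_G ** 2 + sigma_G ** 2) / (2 * tau_G)
    shift_I = round(tau_I + tau_OD - tau_A)   # delays rounded to whole days
    shift_T = round(tau_I + tau_OD)
    # Daily deaths as first differences of the cumulative series.
    daily = [0.0] + [n_D[t] - n_D[t - 1] for t in range(1, len(n_D))]
    n_I = [tau_G / ifr * daily[t + shift_I] for t in range(len(n_D) - shift_I)]
    n_T = [n_D[t + shift_T] / ifr for t in range(len(n_D) - shift_T)]
    return n_I, n_T

# Hypothetical inputs: 10 deaths per day for 40 days and illustrative parameters.
cumulative = [10.0 * t for t in range(40)]
n_I, n_T = infer_populations(cumulative, ifr=0.01, tau_G=5.0, sigma_G=2.0,
                             tau_I=5.0, tau_OD=15.0)
```

Because of the forward shifts, the inferred series are shorter than the death series: the most recent days cannot be estimated until the corresponding deaths have been observed.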

Similarly, we obtain

$$\frac{d}{dt}n_{SP}(t)=j(t-\tau_{SP})+O(j'')$$

(26)

$$n_{TP}(t)=j(t-\tau_{TP})\,\Delta t_{TP}+O(j''),$$

(27)

where \(\tau_{SP}\) is the average seroconversion time after infection and \(\Delta t_{TP}\) is the average number of days an individual tests positive, which up to \(\mathcal{O}(j'')\) leads to

$$n_{SP}(t)=n_T(t-\tau_{SP}),$$

(28)

$$\frac{d}{dt}n_D(t)=\frac{IFR}{\Delta t_{TP}}\,n_{TP}(t+\tau_{TP}-\tau_I-\tau_{OD}).$$

(29)

Combining Eqs. (28) and (29) with Eqs. (24) and (25) leads to

$$n_I(t)=\frac{\tau_G}{\Delta t_{TP}}\,n_{TP}\left(t+\tau_{TP}-\frac{\tau_G^2+\sigma_G^2}{2\tau_G}\right),$$

(30)

$$n_T(t)=n_{SP}(t+\tau_{SP}),$$

(31)

which are used to validate the values of the estimated infectious population \(n_I(t)\) against RT-PCR testing results, \(n_{TP}\), at time \(t+\tau_{TP}-\frac{\tau_G^2+\sigma_G^2}{2\tau_G}\) and of the cumulative infected population \(n_T(t)\) against seropositivity testing, \(n_{SP}\), at time \(t+\tau_{SP}\).

Expected deaths

The raw cumulative death counts over time, \(n_W(t)\), are obtained from the Johns Hopkins University Center for Systems Science and Engineering (Ref. 1) for countries and for US locations.

The daily death counts, \(\Delta n_W(t)=n_W(t)-n_W(t-1)\), are considered to contain reporting artifacts if they are negative or if they are unrealistically large. This last condition is defined explicitly as larger than 4 times its previous 14-day average value plus 10 deaths, \(\Delta n_W(t)>10+4\times\frac{1}{14}\left(n_W(t)-n_W(t-14)\right)\), from a non-sparse reporting schedule with at least 2 consecutive non-zero values before and after the time \(t\), \(\Delta n_W(t)\ne\frac{1}{5}\left(n_W(t+2)-n_W(t-3)\right)\).

Reporting artifacts identified at time \(t\) are considered to be the result of previous miscounting. The excess or lack of deaths is imputed proportionally to previous death counts. Explicitly, death counts are updated as

$$n_W(t-1-i)\leftarrow n_W(t-1-i)\,\frac{n_W(t-1)_{estimated}}{n_W(t-1)}$$

(32)

with \(n_W(t-1)_{estimated}=n_W(t)-\frac{1}{7}\left(n_W(t-1)-n_W(t-8)\right)\) for all \(i\ge 0\). In this way, \(\Delta n_W(t)\) is assigned its previous seven-day average value.
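The detection threshold and the proportional imputation of Eq. (32) can be sketched as follows. This is an illustration under assumptions: the function names and the synthetic series are hypothetical, and the flagging function deliberately omits the non-sparse reporting condition described above.

```python
def is_flagged(n_W, t):
    """Flag day t if its daily count is negative or unrealistically large.

    Simplification: the non-sparse reporting condition is not checked here.
    Requires t >= 14.
    """
    daily = n_W[t] - n_W[t - 1]
    threshold = 10 + 4 * (n_W[t] - n_W[t - 14]) / 14
    return daily < 0 or daily > threshold

def impute_artifact(n_W, t):
    """Rescale cumulative counts up to day t-1 following Eq. (32).

    Requires t >= 8 and n_W[t - 1] > 0. Returns a new list.
    """
    weekly_mean = (n_W[t - 1] - n_W[t - 8]) / 7
    estimated = n_W[t] - weekly_mean
    factor = estimated / n_W[t - 1]
    return [value * factor for value in n_W[:t]] + list(n_W[t:])

# Synthetic series: 10 deaths per day, with 500 backlogged deaths dumped on day 20.
raw = [10.0 * t + (500.0 if t >= 20 else 0.0) for t in range(30)]
flagged_day_20 = is_flagged(raw, 20)
flagged_day_19 = is_flagged(raw, 19)
fixed = impute_artifact(raw, 20)
```

After the correction the cumulative total on the flagged day is unchanged, while the jump is redistributed over earlier days in proportion to their counts.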

The expected daily deaths, \(\Delta n_D(t)\), are obtained through a density estimation multiscale functional, \(f_{de}\), as \(\Delta n_D(t)=f_{de}\left(\Delta n_W(t)\right)\), which leads to the estimation of the expected cumulative deaths at time \(t\) as \(n_D(t)=n_W(t_0)+\sum_{s=t_0+1}^{t}\Delta n_D(s)\). Specifically,

$$f_{de}\left(\Delta n_W(t)\right)=(1-r_1)\,dd_0+r_1\left((1-r_2)\,dd_1+r_2\,dd_2\right)$$

(33)

with

$$r_1=e^{-0.3\,dd_1},$$

(34)

$$r_2=e^{-3\,dd_2},$$

(35)

$$dd_0=ma_{14}\left(ma_{14}\left(\Delta n_W(t)\right)\right),$$

(36)

$$dd_1=rg_{12}\left(ma_{14}\left(\Delta n_W(t)\right)\right),$$

(37)

$$dd_2=rg_{48}\left(ma_{14}\left(\Delta n_W(t)\right)\right),$$

(38)

where \(ma_{14}(\cdot)\) is a centered moving average with a window size of 14 days and \(rg_{\sigma}(\cdot)\) is a centered rolling average through a Gaussian window with standard deviation \(\sigma\). The specific value of the window size has been chosen to mitigate weekly reporting effects. The values of the standard deviations of the Gaussian windows have been selected to achieve a smooth representation of the expected death estimation for each country, as shown in the bottom panels of Supplementary Fig. S1.
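The multiscale functional of Eqs. (33)-(38) can be sketched in a few lines. Several details below are assumptions, since the text does not specify them: how an even 14-day window is centered (offsets -7 to +6 here), how windows are handled at the ends of the series (truncated and renormalized here), and where the Gaussian window is cut off (three standard deviations here).

```python
import math

def moving_average(x, window=14):
    """Centered moving average; windows are truncated at the series edges."""
    half = window // 2
    out = []
    for t in range(len(x)):
        lo, hi = max(0, t - half), min(len(x), t - half + window)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def gaussian_average(x, sigma):
    """Centered Gaussian-weighted rolling average, renormalized at the edges."""
    radius = int(3 * sigma)   # assumed cutoff of the Gaussian window
    out = []
    for t in range(len(x)):
        lo, hi = max(0, t - radius), min(len(x), t + radius + 1)
        weights = [math.exp(-0.5 * ((s - t) / sigma) ** 2) for s in range(lo, hi)]
        out.append(sum(w * v for w, v in zip(weights, x[lo:hi])) / sum(weights))
    return out

def expected_daily_deaths(daily_raw):
    """Blend three smoothing scales following Eqs. (33)-(38)."""
    ma = moving_average(daily_raw)
    dd0 = moving_average(ma)
    dd1 = gaussian_average(ma, 12)
    dd2 = gaussian_average(ma, 48)
    result = []
    for a, b, c in zip(dd0, dd1, dd2):
        r1 = math.exp(-0.3 * b)
        r2 = math.exp(-3.0 * c)
        result.append((1 - r1) * a + r1 * ((1 - r2) * b + r2 * c))
    return result

# Synthetic checks: a constant series and an all-zero series are left unchanged.
flat = expected_daily_deaths([5.0] * 100)
empty = expected_daily_deaths([0.0] * 100)
```

The weights \(r_1\) and \(r_2\) approach 1 when the smoothed counts are close to zero, so the wider Gaussian windows dominate for low death counts and the double moving average dominates for high ones.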

Reporting delays

We consider an average delay of two days between reporting a death and its occurrence. This value is obtained by comparing the daily death counts reported for Spain (Ref. 1) and their actual values (Ref. 33) from February 15 to March 31, 2020. The values of the root-mean-squared deviation between reported and actual deaths shifted by 0, 1, 2, 3, and 4 days are 77.9, 58.4, 38.5, 58.7, and 88.6 deaths, respectively.
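The shift comparison described above can be sketched as a root-mean-squared deviation scan over candidate delays. The function name and the synthetic series are hypothetical; the example does not use the Spanish data and does not reproduce the reported values.

```python
import math

def rmsd_by_shift(reported, actual, shifts=range(5)):
    """RMSD between reported[t + shift] and actual[t] for each candidate shift."""
    scores = {}
    for shift in shifts:
        pairs = [(reported[t + shift], actual[t])
                 for t in range(len(actual)) if t + shift < len(reported)]
        scores[shift] = math.sqrt(sum((r - a) ** 2 for r, a in pairs) / len(pairs))
    return scores

# Synthetic data: the reported series is the actual series delayed by two days.
actual = [float(t ** 2) for t in range(30)]
reported = [0.0, 0.0] + actual[:-2]
scores = rmsd_by_shift(reported, actual)
best_shift = min(scores, key=scores.get)
```

The delay is then taken as the shift with the smallest deviation, which is 2 days for this synthetic series by construction.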

Infection fatality rate (\(IFR\))

The infection fatality rate is computed assuming a homogeneous attack rate as

$$IFR=\frac{1}{\sum_a g_a}\sum_a IFR_a\,g_a,$$

(39)

where \(IFR_a\) is the previously estimated \(IFR\) for the age group \(a\) (Ref. 5) and \(g_a\) is the population in the age group \(a\) as reported by the United Nations for countries (Ref. 18) and the US Census for states (Ref. 19).
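Equation (39) is a population-weighted average of the age-specific rates. A minimal sketch follows; the age brackets, rates, and population counts are made-up placeholders and are not the values of Refs. 5, 18, or 19.

```python
def population_ifr(ifr_by_age, population_by_age):
    """Population-weighted infection fatality rate, Eq. (39)."""
    total = sum(population_by_age[a] for a in ifr_by_age)
    weighted = sum(ifr_by_age[a] * population_by_age[a] for a in ifr_by_age)
    return weighted / total

# Hypothetical age structure and age-specific rates, for illustration only.
ifr_by_age = {"0-49": 0.0005, "50-69": 0.005, "70+": 0.05}
population_by_age = {"0-49": 600, "50-69": 300, "70+": 100}
overall_ifr = population_ifr(ifr_by_age, population_by_age)
```

With these placeholder numbers the overall rate is 0.0068, dominated by the oldest group even though it is the smallest, which illustrates why the demographic structure matters for the estimate.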

Clinical parameters

We obtained the values of the average \(\tau_G\) and standard deviation \(\sigma_G\) of the generation time from Ref. 13, of the averages of the incubation \(\tau_I\) and symptom onset-to-death \(\tau_{OD}\) times from Refs. 5, 14, and of the average number of days \(\Delta t_{TP}\) of positive testing by an infected individual from Refs. 15, 17. The average time at which an individual tested positive after infection, \(\tau_{TP}\), was computed as \(\tau_{TP}=\tau_I-2+\Delta t_{TP}/2\), where we have assumed that on average an individual started to test positive 2 days before symptom onset. The average seroconversion time after infection, \(\tau_{SP}\), was estimated as \(\tau_I\) plus the 7 days of 50% seroconversion after symptom onset reported in Ref. 16.

Dynamical constraints implementation with discrete time

We implemented the dynamical constraints to compute the infectious and infected population as outlined in the main text and as detailed in the previous section of this document, using days as time units. Time delays were rounded to days to assign daily values.

The first derivative of the cumulative number of deaths is computed as

$$\frac{dn_D(t)}{dt}=\Delta n_D(t),$$

(40)

with \(\Delta n_D(t)=n_D(t)-n_D(t-1)\).

The growth rate was computed explicitly from the discrete time series as the centered 7-day difference

$$k_G(t)=\frac{1}{7}\left(\ln\left(\Delta n_D(t+4)+\Delta n_D(t+3)\right)-\ln\left(\Delta n_D(t-3)+\Delta n_D(t-4)\right)\right).$$

(41)

The 7-day value was chosen to mitigate reporting artifacts.
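Equation (41) translates directly into code. The sketch below assumes a list of expected daily deaths indexed by day, with strictly positive values in the two-day windows being compared; the function name and the synthetic series are illustrative.

```python
import math

def growth_rate(daily_D, t):
    """Centered 7-day log difference of expected daily deaths, Eq. (41).

    Requires 4 <= t <= len(daily_D) - 5 and positive counts in both windows.
    """
    ahead = daily_D[t + 4] + daily_D[t + 3]
    behind = daily_D[t - 3] + daily_D[t - 4]
    return (math.log(ahead) - math.log(behind)) / 7

# Synthetic series growing exponentially at a rate of 0.1 per day.
daily = [math.exp(0.1 * t) for t in range(20)]
k_G = growth_rate(daily, 10)
```

The two windows are centered at \(t+3.5\) and \(t-3.5\), so their separation is exactly 7 days and a purely exponential series returns its growth rate unchanged.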

Confidence and credibility intervals

Confidence intervals associated with death counts were computed using bootstrapping with 10,000 realizations (Ref. 34). These confidence intervals were combined with the credibility intervals of the \(IFR\) in infectious and infected populations assuming independence and additivity on a logarithmic scale.

Fold accuracy

The fold accuracy, \(F_A\), is explicitly computed as

$$\log F_A=\frac{1}{N}\sum_{i=1}^{N}\left|\log x_i^{obs}-\log x_i^{est}\right|,$$

(42)

where \(\left|\cdot\right|\) is the absolute value function, \(x_i^{obs}\) is the \(i\)th observation, \(x_i^{est}\) is its corresponding estimation, and \(N\) is the total number of observations.
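A short sketch of Eq. (42) follows. The function name and sample values are hypothetical; natural logarithms are used, which does not affect \(F_A\) because the same base is used to exponentiate the mean.

```python
import math

def fold_accuracy(observed, estimated):
    """Fold accuracy F_A of Eq. (42): exponential of the mean absolute log ratio."""
    mean_abs_log = sum(abs(math.log(o) - math.log(e))
                       for o, e in zip(observed, estimated)) / len(observed)
    return math.exp(mean_abs_log)

# Hypothetical example: one estimate is twice the observation, the other is half.
F_A = fold_accuracy([100.0, 200.0], [200.0, 100.0])
```

Overestimates and underestimates by the same factor contribute equally, so this example gives a fold accuracy of 2, and a perfect estimation gives 1.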

Inference and extrapolation

Because of the delay between infections and deaths, inference for the values of the growth rate and infectious populations ends on December 30, 2020 and for the values of the infected populations ends on December 26, 2020. Extrapolation to the current time (January 21, 2021) is carried out assuming the last growth rate computed.
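One way to read the extrapolation step is as an exponential continuation at the last inferred growth rate. The sketch below is written under that assumption, which the text does not spell out; the function name and the numerical values are hypothetical.

```python
import math

def extrapolate(last_value, last_growth_rate, days_ahead):
    """Continue a series exponentially at a fixed growth rate (assumed form)."""
    return [last_value * math.exp(last_growth_rate * d)
            for d in range(1, days_ahead + 1)]

# Hypothetical last inferred value and growth rate, projected 22 days ahead
# (the span from December 30, 2020 to January 21, 2021).
projection = extrapolate(1000.0, 0.02, 22)
```

A zero growth rate holds the last value constant, and a negative one produces an exponential decline.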

Reproduction number

The quantities \(R_t\) and \(k_G(t)\) are related to each other through the Euler–Lotka equation, \(R_t^{-1}=\int_0^\infty f_{GT}(\tau)\,e^{-k_G(t)\tau}\,d\tau\), which considers \(j(t-\tau)\simeq e^{-k_G(t)\tau}j(t)\) in the renewal equation \(j(t)=\int_0^\infty k_I(t,\tau)\,j(t-\tau)\,d\tau\). Generation times can generally be described through a gamma distribution \(f_{GT}(\tau)=\frac{\beta^{\alpha}}{\Gamma(\alpha)}\tau^{\alpha-1}e^{-\beta\tau}\) with \(\alpha=\tau_G^2/\sigma_G^2\) and \(\beta=\tau_G/\sigma_G^2\), which leads to \(R_t=\left(1+k_G(t)/\beta\right)^{\alpha}\) for \(k_G(t)>-\beta\) and \(R_t=0\) for \(k_G(t)\le-\beta\). In the case of the exponentially distributed limit (\(\alpha\simeq 1\)) or small values of \(k_G(t)/\beta\), it simplifies to \(R_t=1+k_G(t)\,\tau_G\) for \(k_G(t)>-1/\tau_G\) and \(R_t=0\) for \(k_G(t)\le-1/\tau_G\).

Global prevalence data were obtained from multiple data sources (Refs. 35, 36, 37, 38, 39, 40, 41, 42), as described in Supplementary Table S1.
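The gamma-distribution relation between \(R_t\) and \(k_G(t)\) given in this section can be sketched as a small function. The function name and the parameter values in the usage lines are illustrative assumptions, not the values of Ref. 13.

```python
def reproduction_number(k_G, tau_G, sigma_G):
    """R_t from the growth rate for gamma-distributed generation times."""
    alpha = tau_G ** 2 / sigma_G ** 2
    beta = tau_G / sigma_G ** 2
    if k_G <= -beta:
        return 0.0
    return (1 + k_G / beta) ** alpha

# Hypothetical generation-time parameters (days) and growth rates (per day).
R_stationary = reproduction_number(0.0, tau_G=5.0, sigma_G=2.0)
R_exponential_limit = reproduction_number(0.1, tau_G=5.0, sigma_G=5.0)
R_fast_decline = reproduction_number(-2.0, tau_G=5.0, sigma_G=2.0)
```

A zero growth rate gives \(R_t=1\); when \(\sigma_G=\tau_G\) (so \(\alpha=1\)) the function reduces to \(1+k_G(t)\,\tau_G\), here 1.5; and growth rates at or below \(-\beta\) are mapped to zero.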


Source: Ecology - nature.com
