Real-time estimation of the novel coronavirus incubation time

Updated: Thu Jan 30 16:39:49 2020

Our lab has been collecting data (freely available at data/nCoV-IDD-traveler-data.csv) on the exposure and symptom onset for novel coronavirus (nCoV-2019) cases that have been confirmed outside of the Hubei province. These cases have been confirmed either in other countries or in regions of China with no known local transmission. We search for news articles and reports in both English and Chinese and abstract the data necessary to estimate the incubation period of nCoV-2019. Two team members independently review the full text of each case report to ensure that data is correctly input. Discrepancies are resolved by discussion and consensus.

Data summary

There are 101 cases from 38 countries and provinces outside of Hubei, China. Of those, 34 are known to be female (34%) and 63 to be male (62%). The median age is about 52 years (IQR: 36.5-59). 29 cases are from Mainland China (29%), while 72 are from the rest of the world (71%). 61 cases presented with a fever (60%).

This figure displays the exposure and symptom onset windows for each case in our dataset, relative to the right-bound of the exposure window (ER). The blue bars indicate the exposure windows and the red bars indicate the symptom onset windows for each case. Purple areas are where those two bars overlap.

Exposure and symptom onset windows

The necessary components for estimating the incubation period are left and right bounds for the exposure (EL and ER) and symptom onset times (SL and SR) for each case. We use explicit dates and times when they are reported in the source documents. When they are not available, we make the following assumptions:

  • For cases without a reported right-bound on symptom onset time (SR), we use the time that the case is first presented to a hospital or, lacking that, the time that the source document was published
  • For cases without an EL, we use 2019 December 1, which was the onset date for the first reported nCoV-2019 case; though we will test this assumption later
  • For cases without an ER, we use the SR
  • For cases without an SL, we use the EL
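The four fallback rules above can be sketched in a few lines. This is only an illustration of the logic, not the code used in the analysis; the `CaseWindows` class and the `first_presented` and `source_published` field names are hypothetical and do not correspond to columns in the data file.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Optional

# Default left bound on exposure, from the assumption stated above.
DEFAULT_EL = datetime(2019, 12, 1)


@dataclass(frozen=True)
class CaseWindows:
    """Exposure (EL, ER) and symptom onset (SL, SR) bounds for one case."""
    EL: Optional[datetime] = None
    ER: Optional[datetime] = None
    SL: Optional[datetime] = None
    SR: Optional[datetime] = None
    first_presented: Optional[datetime] = None   # hypothetical field name
    source_published: Optional[datetime] = None  # hypothetical field name


def first_known(*values: Optional[datetime]) -> Optional[datetime]:
    """Return the first value that is not None, or None if all are missing."""
    for value in values:
        if value is not None:
            return value
    return None


def fill_missing_bounds(case: CaseWindows,
                        default_el: datetime = DEFAULT_EL) -> CaseWindows:
    """Apply the fallback rules in the order they are listed above."""
    # SR: reported value, else hospital presentation, else publication time.
    sr = first_known(case.SR, case.first_presented, case.source_published)
    # EL: reported value, else the default left bound.
    el = first_known(case.EL, default_el)
    # ER: reported value, else SR.
    er = first_known(case.ER, sr)
    # SL: reported value, else EL.
    sl = first_known(case.SL, el)
    return replace(case, EL=el, ER=er, SL=sl, SR=sr)
```

Passing a different `default_el` (for example, 2018 December 1) would reproduce the kind of alternative left bound that a sensitivity analysis might test.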

Under these assumptions, the median exposure interval was 49 days (range: 1-58.8) and the median symptom onset interval was 1 day (range: 0-58.8).

Incubation period estimates

We estimate the incubation period using the coarseDataTools package, based on the paper by Reich et al, 2009. We assume a log-normal incubation period and use a bootstrap method to calculate confidence intervals.
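To give a feel for the two ingredients named above, here is a rough Python sketch of a log-normal likelihood for interval-censored incubation times and of a percentile bootstrap. It is a simplification under stated assumptions: each incubation time is treated as known only to lie in a single interval `(lo, hi]`, whereas coarseDataTools handles exposure and onset windows jointly. The function names are illustrative and are not part of that package.

```python
import math
import random
from statistics import NormalDist, median


def interval_loglik(mu, sigma, intervals):
    """Log-likelihood of LogNormal(mu, sigma) when each incubation time is
    only known to fall in (lo, hi], with lo and hi in days."""
    std_normal = NormalDist()
    total = 0.0
    for lo, hi in intervals:
        upper = std_normal.cdf((math.log(hi) - mu) / sigma)
        lower = std_normal.cdf((math.log(lo) - mu) / sigma) if lo > 0 else 0.0
        # Guard against log(0) for intervals far out in the tails.
        total += math.log(max(upper - lower, 1e-300))
    return total


def percentile_bootstrap_ci(data, estimator, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap: resample cases with replacement, re-estimate,
    and take the empirical quantiles of the resulting estimates."""
    rng = random.Random(seed)
    estimates = sorted(
        estimator(rng.choices(data, k=len(data))) for _ in range(n_boot)
    )
    alpha = (1.0 - level) / 2.0
    lo_idx = int(alpha * n_boot)
    hi_idx = min(n_boot - 1, int((1.0 - alpha) * n_boot))
    return estimates[lo_idx], estimates[hi_idx]


# Toy usage with made-up intervals (not values from the dataset).
toy_intervals = [(4, 6), (3, 7), (5, 8)]
ll_near = interval_loglik(math.log(5), 0.4, toy_intervals)
ll_far = interval_loglik(math.log(20), 0.4, toy_intervals)
ci = percentile_bootstrap_ci([2, 3, 4, 5, 6, 7, 8, 9, 10], median, n_boot=200)
```

In a full analysis the `estimator` passed to the bootstrap would be a maximum-likelihood fit of `mu` and `sigma`; the sample median is used here only to keep the example short.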

The first model is fit to all of the data. We report the median and other quantiles, including the 2.5th and 97.5th percentiles, along with their confidence intervals:

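| quantity | est | CIlow | CIhigh |
|----------|--------|-------|--------|
| meanlog | 1.644 | 1.495 | 1.798 |
| sdlog | 0.363 | 0.201 | 0.521 |
| p2.5 | 2.542 | 1.829 | 3.564 |
| p5 | 2.850 | 2.153 | 3.849 |
| p25 | 4.052 | 3.411 | 4.859 |
| p50 | 5.174 | 4.460 | 6.037 |
| p75 | 6.608 | 5.474 | 8.062 |
| p95 | 9.394 | 6.887 | 12.844 |
| p97.5 | 10.531 | 7.381 | 15.051 |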

The median incubation period is 5.174 days (CI: 4.46-6.037). About 2.5% of incubation periods are shorter than 2.542 days (CI: 1.829-3.564), while 97.5% of cases would experience symptoms within 10.531 days (CI: 7.381-15.051) of their exposure. The ‘meanlog’ and ‘sdlog’ estimates are the mean and standard deviation of the log incubation time, so exp(meanlog) is the median and sdlog controls the dispersion; i.e. we recommend using a LogNormal(1.644, 0.363) distribution to represent the incubation time distribution.
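For readers who want to use the recommended distribution directly, the following minimal Python sketch computes quantiles and cumulative probabilities for LogNormal(1.644, 0.363) with the standard library only. The function names are illustrative; the two parameter values are the point estimates reported above.

```python
import math
from statistics import NormalDist

# Point estimates reported above (log scale, time in days).
MEANLOG = 1.644
SDLOG = 0.363


def lognormal_quantile(p, meanlog=MEANLOG, sdlog=SDLOG):
    """Incubation time (days) below which a fraction p of cases fall."""
    return math.exp(meanlog + sdlog * NormalDist().inv_cdf(p))


def lognormal_cdf(days, meanlog=MEANLOG, sdlog=SDLOG):
    """Fraction of cases expected to show symptoms within `days` of exposure."""
    return NormalDist().cdf((math.log(days) - meanlog) / sdlog)


median_days = lognormal_quantile(0.5)    # equals exp(meanlog)
upper_days = lognormal_quantile(0.975)
share_by_14 = lognormal_cdf(14.0)
```

Because the parameters are rounded to three decimals, quantiles computed this way agree with the fitted estimates only up to small rounding differences.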

Alternate estimates and sensitivity analyses

Alternate parameterizations

We fit other commonly-used parameterizations of the incubation period as comparisons to the log-normal distribution: gamma, Weibull, and Erlang.

The median estimates are very similar across parameterizations. The Weibull distribution has a slightly smaller value at the 2.5th percentile and the log-normal distribution has a slightly larger value at the 97.5th percentile. The log-likelihoods were also very similar between distributions, with the log-normal distribution having the largest log-likelihood (62.05) and the Erlang distribution the smallest (60.96).

The gamma distribution has an estimated shape parameter of 7.92 (95% CI: 3.97-24.98) and a scale parameter of 0.69 (95% CI: 0.21-1.52). The Weibull distribution has an estimated shape parameter of 3.11 (95% CI: 2.2-6.08) and a scale parameter of 6.11 (95% CI: 5.19-7.25). The Erlang distribution has an estimated shape parameter of 14 (95% CI: 5-21) and a scale parameter of 0.4 (95% CI: 0.26-1.11).

Sensitivity analyses

To make sure that our overall incubation estimates are sound, we ran a few analyses on subsets to see if the results held up. Since the winter often brings cold air and other pathogens that can cause sore throats and coughs, we ran an analysis using only cases that reported a fever. Since a plurality of our cases came from Mainland China, where assumptions about local transmission may be less firm, we ran an analysis without those cases. Finally, we challenged our assumption that unknown ELs can be set to 2019 December 1 (Nextstrain estimates that the first case could have occurred as early as September) by setting unknown ELs to 2018 December 1.

Using only cases that reported a fever, the estimates are 0.377 to 0.854 days longer than the estimates on the full data. 8 of the cases with a fever reported having other symptoms beforehand. While it may take a little longer for an exposure to cause a fever, the estimates are similar to the overall results. The confidence intervals are wider at every quantile because there is less data.

Using only cases from outside of Mainland China, the estimates differ from those on the full data by -0.078 to 1.92 days (positive values are longer). There is a bit of a gap at the long end of the tail, but the confidence intervals overlap for the most part.

When we set the unknown ELs to 2018 December 1 instead of 2019 December 1, the estimates differ from those on the full data by -0.002 to 0.366 days (positive values are longer). Somewhat surprisingly, this changes the estimates less than either of the other sensitivity analyses.

Comparison to other estimates

Backer, Klinkenberg, & Wallinga estimated the incubation period based on 34 early nCoV cases that traveled from Wuhan to other regions in China. Li et al estimated the incubation period based on the first 10 laboratory-confirmed cases in Wuhan. A comparison of the incubation period estimates is shown below:

The median estimates from all models lie between 4.14 and 5.61 days. The lower and upper tails of our distributions are all closer to the median than those from the other studies. Whether this is due to differences in data or in estimation methodology is open for investigation.

Parameter estimates

For the convenience of researchers who need parameter estimates for making infectious disease models, we include a table of the parameter estimates from our analysis and inferred from the other analyses. The parameters are different for each distribution; par1 and par2 are log-mean and log-sd of the log-normal distribution, while they are the shape and scale parameters for the gamma, Weibull, and Erlang distributions.

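| study | type | obs | par1 | par2 |
|-------------|------------|-----|-------|------|
| JHU-IDD | log-normal | 101 | 1.64 | 0.36 |
| JHU-IDD | gamma | 101 | 7.92 | 0.69 |
| JHU-IDD | weibull | 101 | 3.11 | 6.11 |
| JHU-IDD | erlang | 101 | 14.00 | 0.40 |
| Backer 2020 | weibull | 34 | 2.37 | 6.54 |
| Backer 2020 | gamma | 34 | 4.96 | 1.15 |
| Backer 2020 | log-normal | 34 | 1.63 | 0.55 |
| Li 2020 | log-normal | 10 | 1.42 | 0.67 |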
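As a hedged illustration of how these parameters might be consumed, the sketch below recovers quantiles from the log-normal and Weibull rows, which have closed-form quantile functions. The gamma and Erlang rows are left out because their quantiles need a numerical inverse of the incomplete gamma function, which the Python standard library does not provide. The function and variable names are illustrative.

```python
import math
from statistics import NormalDist


def lognormal_quantile(p, meanlog, sdlog):
    """Quantile of a log-normal with par1 = meanlog and par2 = sdlog."""
    return math.exp(meanlog + sdlog * NormalDist().inv_cdf(p))


def weibull_quantile(p, shape, scale):
    """Quantile of a Weibull with par1 = shape and par2 = scale."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


# (study, type, par1, par2) copied from the table above.
ROWS = [
    ("JHU-IDD", "log-normal", 1.64, 0.36),
    ("JHU-IDD", "weibull", 3.11, 6.11),
    ("Backer 2020", "weibull", 2.37, 6.54),
    ("Backer 2020", "log-normal", 1.63, 0.55),
    ("Li 2020", "log-normal", 1.42, 0.67),
]

QUANTILE_FUNCS = {
    "log-normal": lognormal_quantile,
    "weibull": weibull_quantile,
}

# Median incubation time (days) implied by each row.
medians = {
    (study, kind): QUANTILE_FUNCS[kind](0.5, par1, par2)
    for study, kind, par1, par2 in ROWS
}

for (study, kind), value in medians.items():
    print(f"{study:12s} {kind:10s} median = {value:.2f} days")
```

Other quantiles follow by changing the probability argument, for example `weibull_quantile(0.975, 3.11, 6.11)`.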

(Qulu Zheng, Hannah Meredith, Kyra Grantz, Qifang Bi, Forrest Jones, and Stephen Lauer all contributed to this project)