This repository holds the data and analysis scripts used for the paper "Comparing human and model-based forecasts of COVID-19 in Germany and Poland". It is also still used to to create weekly forecast submissions from the epiforecasts team at the London School of Hygiene & Tropical Medicine to the German and Polish Forecast Hub.
Model-based forecasts, which have played an important role in shaping public policy throughout the COVID-19 pandemic, represent an implicit combination of model assumptions and the researcher’s subjective opinion. This work analyses and compares human opinion against model-derived insights to discern relative strengths and weaknesses of both approaches. We compared purely opinion-derived forecasts of cases and deaths from COVID-19 in Germany and Poland, elicited from researchers and volunteers, against predictions from two semi-mechanistic epidemiological models. We also compared these forecasts against an ensemble of model-based, but expert-tuned, forecasts, submitted to the German and Polish Forecast Hub by other research institutions. In addition, we examined the effects of our contributions to the performance of the Hub ensemble. We found aggregated crowd forecasts to outperform all other methods when predicting cases (Weighted Interval Score relative to the Hub ensemble: 0.89), but not when predicting deaths (rel. WIS 1.26). Crowd forecasts were noticeably more overconfident (55% and 75% coverage of the 90% prediction intervals for cases and deaths, respectively) than model-based predictions. Performance of the semi-mechanistic models was good short-term, but deteriorated quickly over time when assumptions were no longer met.
- full manuscript:
manuscript/manuscript.pdf
- analysis script to generate figures for the manuscript:
analysis/analysis.Rmd
All data used for the paper are included as data objects in the covid.german.forecasts
R package.
- The data for the paper were loaded and compiled using the script
data-raw/paper.R
- package data are stored in
data/
The following data are stored:
Name | Description |
---|---|
crowdforecast_data | Crowd forecast data used for the paper |
dailytruth_data | Daily truth data used for the paper |
ensemble_members | Models included in the official hub ensemble |
ensemble_models | Names of all ensemble models |
epitrend | Classification of the epidemic into falling, rising etc (not used) |
filtered_data | Pre-filtered data used for the paper (with death forecasts restricted to the period after December 14th 2020) |
forecast_dates | Forecast dates used for the paper |
locations | Location and Population Look Up for Germany and Poland |
prediction_data | Forecast data used for the paper |
regular_models | Names of all regular models |
truth_data | Truth data used for the paper |
unfiltered_data | Unfiltered version of the combined prediction and truth data used for the paper |
- you can load individual data by running e.g.
covid.german.forecasts::filtered_data
for the main data set.