The main outputs of this repository are a Graph and an NPI file. The Graph consists of exposure, vulnerability, mobility, and historical COVID data at the subnational and national levels. The Graph and NPI file can be used as input for the COVID projection model, which has been developed in partnership between UN OCHA and the Johns Hopkins University Applied Physics Laboratory (JHU/APL). The parameterization has been tested for six countries, and the files required for them are downloaded during setup. These countries are Afghanistan, the Democratic Republic of Congo, Iraq, Somalia, Sudan and South Sudan. We also encourage users to add their own countries of interest using the instructions below.
The methodology and reasoning behind the parameterization implemented in this repository can be found here.
If you have any questions or feedback, contact us at [email protected].
Install all packages from requirements.txt:
pip install -r requirements.txt
If using Anaconda, set up an environment and install the packages from environment.yml:
conda env create --file environment.yml --name covid_param
conda activate covid_param
The configuration for six countries has been implemented in this repository (Afghanistan, the Democratic Republic of Congo, Iraq, Somalia, Sudan and South Sudan).
To download the data for these six countries and produce the outputs, run:
make setup
This will run all individual components to create the outputs. Because large files are downloaded, this may take some time. Alternatively, you can execute the Running step of each individual component to get the outputs for one country.
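As a rough illustration of what running the individual components for one country involves, the sketch below assembles some of the per-component commands documented later in this README. It is not the content of the Makefile: the helper name `component_commands` and the ordering are assumptions, the script names and flags are copied from the Running steps, and the mobility and NPI steps are left out for brevity.

```python
import shlex


def component_commands(iso3, first_run=True):
    """Return a subset of the documented commands for one country.

    Hypothetical helper for illustration only; it does not run anything.
    """
    # The exposure and vulnerability scripts take -d on the first run
    # to download (and, for GHS, mosaic) their large input files.
    download = ["-d"] if first_run else []
    return [
        ["python", "Generate_SADD_exposure_from_tiff.py", iso3, *download],
        ["python", "Generate_vulnerability_file.py", iso3, *download],
        # -d here fetches the latest COVID data, so it is always passed.
        ["python", "Generate_COVID_file.py", iso3, "-d"],
        ["python", "Generate_graph.py", iso3],
    ]


if __name__ == "__main__":
    # AFG is the ISO3 code for Afghanistan.
    for command in component_commands("AFG"):
        print(shlex.join(command))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; each list could be passed to `subprocess.run` from the repository root if desired.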
Execute these steps to update the NPI and graph output files at any time after you have done the initial setup. This will be a lot faster than make setup, since most data has already been downloaded.
- Run make update_npi. This will download the latest NPI information and write it to the output
- Triage the resulting NPIs:
- Copy and paste the output Excel file to the Google sheet
- Indicate if any of the new NPIs can be modelled, and if so, if they should be included in the final input
- For any new measures that are to be included, fill in the Bucky measurement type, affected pcodes, and compliance level
- Run make update. This will run all individual components to create the Graph and NPI file
- Make a new country config file in config/, using config/template.yml as a guide
- Go through each of the Setup and Running steps for the individual components
- Download the admin level 2 country boundaries shapefile from HDX and place it in Inputs/$COUNTRY_ISO3/Shapefile/
- Unzip the contents into a directory with the same name as the shapefile, and add this name to the config file under the admin section
- Also add the language suffix of the primary region name (e.g. EN)
- Commit the shapefile to the repository
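A quick way to confirm the layout described above before running the exposure script is to build the expected path and look for .shp files in it. This is a minimal sketch and not part of the repository: the helper names `shapefile_dir` and `find_shapefiles` and the directory name `example_adm2` are hypothetical, and only the Inputs/$COUNTRY_ISO3/Shapefile/ layout comes from the steps above.

```python
from pathlib import Path


def shapefile_dir(iso3, shapefile_name, root="Inputs"):
    """Build the directory where the unzipped shapefile is expected to live."""
    return Path(root) / iso3 / "Shapefile" / shapefile_name


def find_shapefiles(directory):
    """Return the .shp files in a directory, or an empty list if it is missing."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(directory.glob("*.shp"))


if __name__ == "__main__":
    # "example_adm2" stands in for the real shapefile name from the config file.
    target = shapefile_dir("AFG", "example_adm2")
    found = find_shapefiles(target)
    if found:
        print(f"Found {len(found)} shapefile(s) in {target}")
    else:
        print(f"No .shp file found in {target}; check the download and unzip steps")
```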
The first time you run, execute:
python Generate_SADD_exposure_from_tiff.py [Country ISO code] -d
The -d flag is for downloading the WorldPop files (they are large).
- Check GHS for the grid square numbers that cover the country and add these to the config file under the ghs section. It helps to double check that the grid squares fully cover the country by displaying the admin regions file over the GHS rasters using QGIS.
- Download food security data from IPC:
- Select the country from the dropdown menu, use the date slider to select only data from 2020, and export using the "Excel" button
- Save the Excel file to Inputs/$COUNTRY_ISO3/IPC
- Add the filename, last row number, and admin level to the config file in the ipc section
- In the replace_dict, add any region names that have a different format than in the admin regions file (the script will also warn you about any mismatches so you can fill this part in iteratively)
- Commit the Excel file to the repository
- If available, add the following to the config file:
- solid fuels
- raised blood pressure
- diabetes (from WHO or GHDX)
- handwashing facilities
- smoking
Make sure you have successfully run the exposure script for the country.
To run, execute:
python Generate_vulnerability_file.py [Country ISO code] -d
The -d flag is for downloading and mosaicking the GHS data the first time you run.
- Add the car ownership fraction from WHO and household size from the UN
- If either or both sources are missing, set the household size to 1 and the car ownership fraction to some reasonable final maximum value (like 0.2)
Make sure you have downloaded the country shapefile as described in the Exposure step.
To run for the first time, execute:
python Generate_mobility_matrix.py [Country ISO code]
The script automatically caches the distances between regions, and the road intersection information.
Since the shapefile should rarely be updated, you can usually run using the cached distances by using the -d flag:
python Generate_mobility_matrix.py [Country ISO code] -d
HOTOSM occasionally refreshes the roads file, so it's good to update it every so often. However, if you're in a hurry you can also run with the cached road intersections using the -c flag:
python Generate_mobility_matrix.py [Country ISO code] -d -c
- Find the COVID-19 dataset file for the country on HDX, and add the URL for the direct download to the config file under covid:url
- Set the different parameters according to the description in the config file template
- Make sure the replace_dict field is accurate, so that the admin names in the COVID file match those in the exposure file
To run, execute:
python Generate_COVID_file.py [Country ISO code] -d
The -d flag is for downloading the latest COVID data.
A common warning is given when the admin names in the COVID file don't match those in the exposure file. The following warning is printed to the terminal: missing PCODE for the following admin units: followed by the list of units. To fix this, add the missing units to the replace_dict dictionary in the config file.
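The role of replace_dict in resolving this warning can be sketched as follows. This is an illustration under stated assumptions, not the logic of Generate_COVID_file.py: the function name `assign_pcodes`, the region names, and the pcodes `XX01` and `XX02` are all hypothetical, and replace_dict is assumed to map a name as spelled in the COVID file to the spelling used in the exposure file.

```python
def assign_pcodes(covid_names, name_to_pcode, replace_dict):
    """Map admin names from the COVID file to pcodes.

    Returns the matched names with their pcodes and the list of names
    for which no pcode could be found.
    """
    matched, missing = {}, []
    for name in covid_names:
        # Apply the replacement first, falling back to the name itself.
        canonical = replace_dict.get(name, name)
        pcode = name_to_pcode.get(canonical)
        if pcode is None:
            missing.append(name)
        else:
            matched[name] = pcode
    if missing:
        print(f"missing PCODE for the following admin units: {missing}")
    return matched, missing


if __name__ == "__main__":
    exposure_pcodes = {"North": "XX01", "East": "XX02"}  # made-up values
    covid_names = ["North", "Eastern"]

    # Without a replacement, "Eastern" has no match and triggers the warning.
    assign_pcodes(covid_names, exposure_pcodes, replace_dict={})

    # Adding the mismatched name to replace_dict resolves it.
    matched, missing = assign_pcodes(
        covid_names, exposure_pcodes, replace_dict={"Eastern": "East"}
    )
    print(matched, missing)
```

Rerunning after each addition to replace_dict shrinks the list of missing units until the warning disappears.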
The graph collects the COVID-19 case data, mobility data, contact matrix, population data, and vulnerability data into a single file.
- Under the contact_matrix section of the config file, add the country name of the country used for the contact matrix, and whether it falls alphabetically in file 1 (Albania to Morocco) or file 2 (Mozambique to Zimbabwe)
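The alphabetical split mentioned above can be worked out with a simple string comparison. The sketch below is a convenience under stated assumptions: the function name `contact_matrix_file` is hypothetical, and it only compares the name against "Morocco" without checking that the country actually appears in either file, so the result should still be verified against the files themselves.

```python
def contact_matrix_file(country_name):
    """Return 1 for names up to and including Morocco, otherwise 2.

    File 1 covers Albania to Morocco and file 2 covers Mozambique to
    Zimbabwe, so the boundary falls just after "Morocco".
    """
    return 1 if country_name.strip().casefold() <= "morocco" else 2


if __name__ == "__main__":
    for name in ["Albania", "Morocco", "Mozambique", "Zimbabwe"]:
        print(name, contact_matrix_file(name))
```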
To run, execute:
python Generate_graph.py [Country ISO code]
- Run first with the -u flag to create the Excel file (and with the -d flag to get the latest ACAPS file)
- Copy and paste the contents into a new Google sheet, and publish it as a csv
- Add the URL of the published sheet to the config file under NPIs
- Run with the -f flag to create the final csv file for Bucky, and commit this file to the repository
Make sure you have downloaded the country shapefile as described in the Exposure step.
There are two modes to run the NPI script:
- --update-npi-list or -u: This mode uses the local ACAPS data file (use the -d flag to download the latest version), and creates an Excel file (located in Inputs/[Country ISO code]/NPIs/[Country ISO code]_NPIs_input.xlsx) for each country containing the ACAPS measures and, if it exists, any additional parameters / measures from the country's Google sheet
- After running, you should copy and paste the cells of the Excel file to the 'Published' tab on the country's Google sheet, which should then be triaged
- --create-final-list or -f: Generate a csv file of NPI results to be read in by Bucky
To run in update mode:
python Generate_NPIs.py [Country ISO code] -u -d
The -d flag is for downloading the latest ACAPS data.
To run in final file creation mode:
python Generate_NPIs.py [Country ISO code] -f
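For readers who want to see how a command line with these two modes could be declared, here is a minimal argparse sketch. It is not the parser used by Generate_NPIs.py: making the modes mutually exclusive and required is an assumption, as are the `dest` name `download` and the help strings. Only the flags -u / --update-npi-list, -f / --create-final-list, and -d come from the text above.

```python
import argparse


def build_parser():
    """Build an illustrative parser with the two documented NPI modes."""
    parser = argparse.ArgumentParser(description="Sketch of an NPI command line")
    parser.add_argument("country_iso3", help="Country ISO code")

    # Assumption: exactly one of the two modes must be chosen per run.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-u", "--update-npi-list", action="store_true",
                      help="create the Excel file of measures to triage")
    mode.add_argument("-f", "--create-final-list", action="store_true",
                      help="create the final csv file for Bucky")

    # Only the short form -d is documented, so no long form is defined here.
    parser.add_argument("-d", dest="download", action="store_true",
                        help="download the latest ACAPS data")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["AFG", "-u", "-d"])
    print(args)
```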
This script checks the two final output files (the graph and NPIs) for any missing or unexpected values.
Make sure you have generated the graph and NPI files.
python Check_output_quality.py [Country ISO code]
To do an extra-thorough check and run with warnings, use the -w flag:
python Check_output_quality.py [Country ISO code] -w
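The specific checks performed by Check_output_quality.py are not listed here. As a generic illustration of what a "missing values" check on a csv output such as the NPI file might look like, the sketch below reports empty cells by line and column. The function name `find_missing_cells` and the column names in the example are hypothetical.

```python
import csv
import io


def find_missing_cells(csv_text):
    """Return (line_number, column) pairs for empty or absent cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    # The header is line 1, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        for column, value in row.items():
            if column is None:
                # DictReader stores surplus fields under the key None.
                problems.append((line_number, "<extra fields>"))
            elif value is None or value.strip() == "":
                problems.append((line_number, column))
    return problems


if __name__ == "__main__":
    # Made-up content with one empty cell on line 3.
    example = "admin_unit,compliance\nXX01,0.5\nXX02,\n"
    print(find_missing_cells(example))
```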