Commit

final typo fix
thegargiulian committed Dec 12, 2023
1 parent 358a8c7 commit 9110b0e
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions paper.md
@@ -44,11 +44,11 @@ Collecting data on human rights abuses in conflict settings is a difficult and o

The data analyzed in the joint JEP-CEV-HRDAG project was no exception to this empirical reality and was subject to two types of missing data: missing fields and underreporting. Related to missing fields, some records were missing socio-demographic information about victims such as their age, sex, or ethnicity, information identifying armed groups thought to be responsible for the violence, or precise information about the date and location of a particular violent event. These gaps in the data pose challenges for analyses seeking to stratify the data based on any fields containing missing values. With respect to underreporting, some instances of violence were not documented by any of the databases we received, leaving some victims' stories untold [@amado2022]. Moreover, this missingness is unlikely to be randomly distributed among members of the victim population, meaning that inferences drawn from samples of documented victims alone could result in erroneous conclusions about patterns of violence.

-The joint JEP-CEV-HRDAG project employed two statistical methods to address the two types of missingness. To address missing fields within records of documented victims, the project used the `R` package `mice` [@vanbuuren2011] to perform multiple imputation [e.g., @murray2018], probabilistically filling in missing values at the record level multiple times. Multiple systems estimation [e.g., @bird2018; @chao2001], performed on the imputed replicate files, was then used to estimate the number of missing observations, that is, the number of the victims never documented by any of the data sources used in the project.[^mse] To estimate the number of missing observations, we used a Bayesian latent class multiple-capture model [@manriquevallier2016] implemented in the `R` package `LCMCR`. The analyses presented in the technical appendix of the joint project combine these two methods to examine patterns of enforced disappearance, homicide, kidnapping, and forced recruitment of minors in the armed conflict.[^displacement]
+The joint JEP-CEV-HRDAG project employed two statistical methods to address the two types of missingness. To address missing fields within records of documented victims, the project used the `R` package `mice` [@vanbuuren2011] to perform multiple imputation [e.g., @murray2018], probabilistically filling in missing values at the record level multiple times. Multiple systems estimation [e.g., @bird2018; @chao2001], performed on the imputed replicate files, was then used to estimate the number of missing observations, that is, the number of the victims never documented by any of the data sources used in the project.[^mse] To estimate the number of missing observations, we used a Bayesian latent class multiple capture-recapture model [@manriquevallier2016] implemented in the `R` package `LCMCR`. The analyses presented in the technical appendix of the joint project combine these two methods to examine patterns of enforced disappearance, homicide, kidnapping, and forced recruitment of minors in the armed conflict.[^displacement]

[^mse]: Multiple systems estimation is also called capture-recapture in some disciplines.

-[^displacement]: While the join JEP-CEV-HRDAG project also examined forced displacement due to the armed conflict, we were unable to provide multiple systems estimation estimates of forced displacements because nearly all documented victims were registered on only one list, the *Registro Único de Víctimas*. As a result, we did not have sufficient overlap with other sources to construct estimates using multiple systems estimation, which generally requires three or more sources in the case of applications to human rights questions.
+[^displacement]: While the joint JEP-CEV-HRDAG project also examined forced displacement due to the armed conflict, we were unable to provide multiple systems estimation estimates of forced displacements because nearly all documented victims were registered on only one list, the *Registro Único de Víctimas*. As a result, we did not have sufficient overlap with other sources to construct estimates using multiple systems estimation, which generally requires three or more sources in the case of applications to human rights questions.

DANE has published 100 imputed replicate files with missing values filled in at the record level available for each of these four violations. This data format where there is no single file representing "the data" may be unfamiliar to researchers who have not worked with multiple imputation methods in the past and researchers may be tempted to select a single imputed replicate file to conduct their analyses rather than computing their analyses on multiple replicate files and combining the results using standard practices based on the laws of total expectation and total variance. The `verdata` package aims to support researchers in using the data from the Colombian Truth Commission responsibly and correctly despite the potential unfamiliarity of its structure. Software packages have not historically been created to facilitate the access and use of data published by past truth commissions. To date, the `pinochet` package [@freire2019], which facilitates access to data about killings and disappearances published in the Chilean Truth Commission, is the only other example of a software package created for this purpose.
