From 1f871bb531361a470db2612bd7612ef49c22bca8 Mon Sep 17 00:00:00 2001
From: Maria Gargiulo
Date: Wed, 25 Oct 2023 15:06:52 +0200
Subject: [PATCH] situate package in data ecosystem

---
 paper.bib | 9 +++++++++
 paper.md  | 5 +++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/paper.bib b/paper.bib
index 9883e18..df94357 100644
--- a/paper.bib
+++ b/paper.bib
@@ -113,3 +113,12 @@ @article{lum2010
   year={2010},
   publisher={De Gruyter}
 }
+
+@misc{freire2019,
+  title={Deaths and Disappearances in the Pinochet Regime: A New Dataset},
+  DOI={10.31235/osf.io/vqnwu},
+  publisher={SocArXiv},
+  author={Freire, Danilo and Skarbek, David and Meadowcroft, John and Guerrero, Eugenia},
+  year={2019},
+  month={May}
+}
diff --git a/paper.md b/paper.md
index 4cb3b77..0978c2e 100644
--- a/paper.md
+++ b/paper.md
@@ -8,7 +8,6 @@ tags:
   - multiple imputation
   - Colombia
   - conflict
-  - social science
 authors:
   - name: Maria Gargiulo
@@ -51,7 +50,9 @@ The joint JEP-CEV-HRDAG project employed two statistical methods to address the
 
 [^displacement]: While the join JEP-CEV-HRDAG project also examined forced displacement due to the armed conflict, we were unable to provide multiple systems estimation estimates of forced displacements because nearly all documented victims were registered on only one list, the *Registro Único de Víctimas*. As a result, we did not have sufficient overlap with other sources to construct estimates using multiple systems estimation, which generally requires three or more sources in the case of applications to human rights questions.
 
-DANE has published 100 imputed replicate files with missing values filled in at the record level available for each of these four violations. This data format where there is no single file representing "the data" may be unfamiliar to researchers who have not worked with multiple imputation methods in the past and researchers may be tempted to select a single imputed replicate file to conduct their analyses rather than computing their analyses on multiple replicate files and combining the results using standard practices based on the laws of total expectation and total variance. The `verdata` package aims to support researchers in using the data from the Colombian Truth Commission responsibly and correctly despite the potential unfamiliarity of its structure. To complement the package, we have also created a [repository](https://github.com/HRDAG/verdata-examples) of examples of basic function use, replications of main findings from the technical appendix, and applications to other studies of interest not examined in the technical appendix. We have also published a series of pre-calculated estimates that researchers can opt to use to reduce the computational costs of multiple systems estimation. These pre-calculated estimates are available from the Colombian Truth Commission [website](http://comisiondelaverdad.co/analitica-de-datos-informacion-y-recursos#c3).
+DANE has published 100 imputed replicate files with missing values filled in at the record level available for each of these four violations. This data format where there is no single file representing "the data" may be unfamiliar to researchers who have not worked with multiple imputation methods in the past and researchers may be tempted to select a single imputed replicate file to conduct their analyses rather than computing their analyses on multiple replicate files and combining the results using standard practices based on the laws of total expectation and total variance. The `verdata` package aims to support researchers in using the data from the Colombian Truth Commission responsibly and correctly despite the potential unfamiliarity of its structure. Software packages have not historically been created to facilitate the access and use of data published by past truth commissions. To date, the `pinochet` package [@freire2019], which facilitates access to data about killings and disappearances published in the Chilean Truth Commission, is the only other example of a software package created for this purpose.
+
+To complement the package, we have also created a [repository](https://github.com/HRDAG/verdata-examples) of examples of basic function use, replications of main findings from the technical appendix, and applications to other studies of interest not examined in the technical appendix. We have also published a series of pre-calculated estimates that researchers can opt to use to reduce the computational costs of multiple systems estimation. These pre-calculated estimates are available from the Colombian Truth Commission [website](http://comisiondelaverdad.co/analitica-de-datos-informacion-y-recursos#c3).
 
 We hope that `verdata` will play a role in expanding the use of statistical methods to address the two types of missing data in research on the conflict in Colombia, and armed conflicts more generally, so that the statistical biases apparent in individual data sources are not reproduced in future research on the conflict.