Commit

situate package in data ecosystem
thegargiulian committed Oct 25, 2023
1 parent 67ea8d1 commit 1f871bb
Showing 2 changed files with 12 additions and 2 deletions.
9 changes: 9 additions & 0 deletions paper.bib
@@ -113,3 +113,12 @@ @article{lum2010
year={2010},
publisher={De Gruyter}
}

@misc{freire2019,
title={Deaths and Disappearances in the Pinochet Regime: A New Dataset},
DOI={10.31235/osf.io/vqnwu},
publisher={SocArXiv},
author={Freire, Danilo and Skarbek, David and Meadowcroft, John and Guerrero, Eugenia},
year={2019},
month={May}
}
5 changes: 3 additions & 2 deletions paper.md
@@ -8,7 +8,6 @@ tags:
- multiple imputation
- Colombia
- conflict
- social science

authors:
- name: Maria Gargiulo
@@ -51,7 +50,9 @@ The joint JEP-CEV-HRDAG project employed two statistical methods to address the

[^displacement]: While the joint JEP-CEV-HRDAG project also examined forced displacement due to the armed conflict, we were unable to provide multiple systems estimation estimates of forced displacements because nearly all documented victims were registered on only one list, the *Registro Único de Víctimas*. As a result, we did not have sufficient overlap with other sources to construct estimates using multiple systems estimation, which generally requires three or more sources in the case of applications to human rights questions.

DANE has published 100 imputed replicate files with missing values filled in at the record level available for each of these four violations. This data format where there is no single file representing "the data" may be unfamiliar to researchers who have not worked with multiple imputation methods in the past and researchers may be tempted to select a single imputed replicate file to conduct their analyses rather than computing their analyses on multiple replicate files and combining the results using standard practices based on the laws of total expectation and total variance. The `verdata` package aims to support researchers in using the data from the Colombian Truth Commission responsibly and correctly despite the potential unfamiliarity of its structure. To complement the package, we have also created a [repository](https://github.com/HRDAG/verdata-examples) of examples of basic function use, replications of main findings from the technical appendix, and applications to other studies of interest not examined in the technical appendix. We have also published a series of pre-calculated estimates that researchers can opt to use to reduce the computational costs of multiple systems estimation. These pre-calculated estimates are available from the Colombian Truth Commission [website](http://comisiondelaverdad.co/analitica-de-datos-informacion-y-recursos#c3).
DANE has published 100 imputed replicate files, with missing values filled in at the record level, for each of these four violations. This data format, in which no single file represents "the data," may be unfamiliar to researchers who have not worked with multiple imputation methods before. Such researchers may be tempted to conduct their analyses on a single imputed replicate file rather than computing them on multiple replicate files and combining the results using standard practices based on the laws of total expectation and total variance. The `verdata` package aims to support researchers in using the data from the Colombian Truth Commission responsibly and correctly despite the potential unfamiliarity of its structure. Software packages have not historically been created to facilitate access to and use of data published by past truth commissions. To date, the `pinochet` package [@freire2019], which facilitates access to data about killings and disappearances published by the Chilean Truth Commission, is the only other example of a software package created for this purpose.
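
As a rough illustration of what combining results across replicate files can look like, the sketch below applies the combining rules commonly known as Rubin's rules, which follow from the laws of total expectation and total variance. It is written in Python for brevity and is not the `verdata` API; the function name `combine_replicates` and the input values are hypothetical, and we assume each replicate yields a point estimate together with its estimated variance.

```python
from statistics import mean, variance


def combine_replicates(estimates, variances):
    """Pool per-replicate results (hypothetical helper, not part of verdata).

    estimates: one point estimate per imputed replicate file
    variances: the estimated variance of each of those point estimates
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two replicates with matching variances")

    # Law of total expectation: the pooled estimate is the mean across replicates.
    pooled_estimate = mean(estimates)

    # Law of total variance: average within-replicate variance plus the
    # between-replicate variance, inflated by (1 + 1/m) because only a
    # finite number of replicates is available.
    within = mean(variances)
    between = variance(estimates)  # sample variance, divides by m - 1
    total_variance = within + (1 + 1 / m) * between

    return pooled_estimate, total_variance


# Made-up numbers for three replicates, purely for illustration.
estimates = [10.0, 12.0, 14.0]
variances = [1.0, 1.0, 1.0]
pooled, total_var = combine_replicates(estimates, variances)
```

An analysis run on a single replicate would report only that replicate's own variance and so omit the between-replicate term, understating the uncertainty introduced by imputation.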

To complement the package, we have also created a [repository](https://github.com/HRDAG/verdata-examples) of examples of basic function use, replications of main findings from the technical appendix, and applications to other studies of interest not examined in the technical appendix. We have also published a series of pre-calculated estimates that researchers can opt to use to reduce the computational costs of multiple systems estimation. These pre-calculated estimates are available from the Colombian Truth Commission [website](http://comisiondelaverdad.co/analitica-de-datos-informacion-y-recursos#c3).

We hope that `verdata` will play a role in expanding the use of statistical methods to address the two types of missing data in research on the conflict in Colombia, and armed conflicts more generally, so that the statistical biases apparent in individual data sources are not reproduced in future research on the conflict.

