Home
This code contains a complete replication for the paper "REDI for Binned Data: A Random Empirical Distribution Imputation method for estimating continuous incomes" by Molly M. King.
The files within the subfolder redi_package contain the Stata .ado file and help file for the redi package. To use the package for your own research, download the .ado file and .sthlp file and follow the instructions in the .sthlp file. You can ignore the rest of the repo.
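As a minimal sketch of one way to make the downloaded package visible to Stata, the commands below list Stata's ado directories and then check that the command is found. The command name `redi` and the folder path are assumptions for illustration; the .sthlp file remains the authoritative guide.

```stata
* List the directories Stata searches for .ado files.
* Copying the downloaded .ado and .sthlp files into PERSONAL is one option.
sysdir

* Alternative: add the download folder to the search path for this session.
* (Hypothetical path; replace with your own.)
adopath + "C:/research/redi/redi_package"

* Confirm that Stata can locate the command, then open its help file.
* (Assumes the command is named redi.)
which redi
help redi
```

If `which redi` reports that the command is not found, the files are most likely in a folder that is not on the ado path.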
The files within the subfolder redi_proof_of_concept contain the .do and .log files for the Sociological Methodology proof-of-concept paper. See below for details on replicating these results.
To replicate the REDI proof of concept, you must first download three datasets:
- ACS data from IPUMS USA:
  - Go to https://usa.ipums.org/.
  - Select sample years 2016 and 2017. Select variables: YEAR, REPWTP, PERWT, HHINCOME. See how the selection should look here: https://osf.io/e2dtr/
  - After unzipping, name the downloaded .dat file "usa_00008.dat".
- CPS ASEC data from IPUMS CPS:
  - Go to https://cps.ipums.org/.
  - Select sample years 2016 and 2017. Select variables: YEAR, ASECWTH, HHINCOME, PERNUM, SEX, RACE, HISPAN, and EDUC. See how the selection should look here: https://osf.io/z3pcw/
  - After unzipping, name the downloaded .dat file "cps_00009.dat".
- R-CPI-U-RS research series from the U.S. Bureau of Labor Statistics:
  - Download the spreadsheet from the BLS website: https://www.bls.gov/cpi/research-series/r-cpi-u-rs-home.htm.
  - Code to import and use this data can be found along with the rest of the code.
Now you will create a file system to keep track of data and results from the replication.
- Create an empty folder to hold the code for this project. In "redi00_master.do", change the directory to match this folder:
global redi "YOUR / DIRECTORY / HERE" // where all replication .do files stored
- Create a new blank folder to serve as an umbrella folder for holding all data related to this replication.
- Create a new blank folder within this data folder and place all of the source data files downloaded above in it. Change the directory in the "redi00_master.do" file to match this folder:
global source "YOUR / DIRECTORY / HERE" // original datasets (ACS, CPS ASEC)
- Create another new blank folder within the data folder. It will hold the clean but unmodified data sets extracted from the raw data. Change the directory in the "redi00_master.do" file to match this folder:
global extr "YOUR / DIRECTORY / HERE" // extracted datasets - a file to extract original data
- Create yet another new blank folder within your umbrella folder. It will hold the modified data sets that you are actively working with. Again, change the directory in the "redi00_master.do" file:
global deriv "YOUR / DIRECTORY / HERE" // derived datasets - a file to store final data
- Create one final folder to hold temporary data files. Change the directory in the "redi00_master.do" file:
global temp "YOUR / DIRECTORY / HERE" // temporary datasets - an empty file to store temporary data
Since these are global variables, you will not need to reestablish these directory connections at any other point in the code. You will, however, need to run this part of the code before running any other .do file in the project.
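As an illustration, the sketch below shows what the five globals might look like once filled in, followed by an optional check that each folder exists and that the two IPUMS extracts are in place. The paths are hypothetical, and the checks are not part of the replication code; they only assume the folder layout and file names described above.

```stata
* Hypothetical directory layout; replace every path with your own.
global redi   "C:/research/redi/code"
global source "C:/research/redi/data/source"
global extr   "C:/research/redi/data/extracted"
global deriv  "C:/research/redi/data/derived"
global temp   "C:/research/redi/data/temp"

* Optional check (not in the original code): warn about any global
* that does not point to an existing folder.
foreach g in redi source extr deriv temp {
    mata: st_local("found", strofreal(direxists(st_global("`g'"))))
    if `found' == 0 {
        display as error "global `g' does not point to an existing folder"
    }
}

* Optional check: confirm the renamed IPUMS extracts are in the source folder.
foreach f in usa_00008.dat cps_00009.dat {
    capture confirm file "$source/`f'"
    if _rc {
        display as error "Missing source file: `f'"
    }
}
```

Forward slashes are used here because Stata accepts them in paths on Windows as well as on macOS and Linux.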
From this point forward, the code should be self-documenting. Proceed in the order laid out in the file "redi00_master.do".