Health and survival transition data

This is the companion website for the paper "Health dynamics, life expectancy heterogeneity, and the racial gap in Social Security wealth". It hosts the health-to-health transition and survival probabilities for the United States, estimated from the Health and Retirement Study (HRS), in CSV and Excel formats.

Authors: Richard Foltyn, Jonna Olsson

Citation and license

The contents of this repository are licensed under a Creative Commons Attribution 4.0 International License.

If you use the material in your research, please cite:

Foltyn, Richard and Jonna Olsson: "Health dynamics, life expectancy heterogeneity, and the racial gap in Social Security wealth", 2024

You can download the citation in BibTeX format (health-process.bib).

Health and survival probabilities

The directory Health-5 contains the estimates for the benchmark model using all five health states reported in the HRS.
The directory Health-3 contains the estimates for a model with a smaller set of only three health states, where we combine the first two ("excellent" and "very good") and the last two ("fair" and "poor") states.
The directory Health-2 contains the estimates for a model where the first three health states are merged into one group and the last two form the other group.

How to use the data

The CSV files contain the health-to-health transition and survival probabilities for individuals aged 50 to 99. The estimates for each demographic group are stored in a separate file.

The CSV files have the following format:

Excluding the header line, each group of six lines forms an age-specific block: the first six lines are for age 50, the next six for age 51, and so on.
Within each block, the first five lines correspond to the initial health state: (1) excellent, (2) very good, ..., (5) poor.
After the leading age and health columns, each column corresponds to one outcome: the first five columns are the health states, and the last column is the probability of dying.
The sixth line represents the absorbing state of death. It is included for completeness so that each age-specific transition matrix is 6-by-6.

Loading the data

Python

The easiest way to load the CSV files is to use the pandas library:

import pandas as pd

# Create DataFrame from CSV data
df = pd.read_csv('H5_trans_prob_age50-99_nonblack_male.csv', sep=',', index_col=['age', 'health'])

# Print the first 5 rows of the DataFrame
print(df.head())
             Health1   Health2   Health3   Health4   Health5     Death
age health                                                            
50  1       0.720009  0.241329  0.030531  0.005596  0.001696  0.000839
    2       0.104946  0.733775  0.146507  0.013232  0.000663  0.000877
    3       0.012944  0.179531  0.699388  0.095823  0.009584  0.002730
    4       0.005203  0.022331  0.228457  0.649878  0.083380  0.010752
    5       0.009192  0.003818  0.022781  0.224592  0.698203  0.041414
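With the two-level index, the transition matrix for a single age can be pulled out by partial indexing. The sketch below is only illustrative: the helper `transition_matrix_for_age` is a hypothetical name, and the small DataFrame is synthetic stand-in data that mimics the layout shown above, not the actual estimates. It also assumes the death row is stored as health state 6.

```python
import numpy as np
import pandas as pd


def transition_matrix_for_age(df, age):
    """Return all rows for one age as a 2-D NumPy array (hypothetical helper)."""
    return df.loc[age].to_numpy()


# Synthetic stand-in for the CSV contents (NOT the actual estimates):
# every state keeps 90% of its mass and loses 10% to death,
# which makes the last row the absorbing death state.
block = 0.9 * np.eye(6)
block[:, -1] += 0.1

columns = ['Health1', 'Health2', 'Health3', 'Health4', 'Health5', 'Death']
# Assumption: the death row is indexed as health state 6
index = pd.MultiIndex.from_product([[50, 51], range(1, 7)], names=['age', 'health'])
demo = pd.DataFrame(np.tile(block, (2, 1)), index=index, columns=columns)

# 6-by-6 transition matrix for age 51
mat = transition_matrix_for_age(demo, 51)

# One-year death probability at age 50 for initial health state 3
p_death = demo.loc[(50, 3), 'Death']
```

With the real data, `demo` would be replaced by the `df` created with `pd.read_csv` above.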

Alternatively, plain numpy also works:

import numpy as np

data = np.loadtxt('H5_trans_prob_age50-99_nonblack_male.csv', delimiter=',', skiprows=1)

# Transition probabilities
prob = np.ascontiguousarray(data[:, 2:])

# Age corresponding to each row in prob array
age = np.array(data[:, 0], dtype=int)

# Health state corresponding to each row in prob array
health = np.array(data[:, 1], dtype=int)
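For model code it is often convenient to work with one matrix per age. The following is a minimal sketch that reshapes a stacked array such as `prob` into a three-dimensional array and runs basic sanity checks. The function names, the tolerance, and the demo data are assumptions made for illustration; the check on the last row assumes the death row is stored as (0, ..., 0, 1).

```python
import numpy as np


def split_into_age_blocks(prob, n_states=6, first_age=50):
    """Reshape stacked rows into an array of shape (n_ages, n_states, n_states)."""
    prob = np.asarray(prob, dtype=float)
    if prob.ndim != 2 or prob.shape[1] != n_states or prob.shape[0] % n_states != 0:
        raise ValueError('expected an array of shape (n_ages * n_states, n_states)')
    blocks = prob.reshape(-1, n_states, n_states)
    # Age corresponding to each block, assuming consecutive ages
    ages = first_age + np.arange(blocks.shape[0])
    return ages, blocks


def is_valid_block(block, atol=1e-5):
    """Check that rows sum to one and that the last state is absorbing."""
    block = np.asarray(block, dtype=float)
    rows_sum_to_one = np.allclose(block.sum(axis=1), 1.0, atol=atol)
    absorbing = np.zeros(block.shape[1])
    absorbing[-1] = 1.0
    last_row_absorbing = np.allclose(block[-1], absorbing, atol=atol)
    return bool(rows_sum_to_one and last_row_absorbing)


# Synthetic stand-in for `prob` (NOT the actual estimates): three identical
# age blocks in which each state loses 10% of its mass to death.
demo_block = 0.9 * np.eye(6)
demo_block[:, -1] += 0.1
demo_prob = np.tile(demo_block, (3, 1))

ages, blocks = split_into_age_blocks(demo_prob)
all_valid = all(is_valid_block(b) for b in blocks)
```

Applied to the real data, `blocks[age - 50]` would then be the annual transition matrix for a given age.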

Transition probabilities at two-year horizons

The next four graphs show the two-year probabilities of transitioning between the five self-reported health states conditional on survival, as well as the two-year survival probability for each initial health state and age. The model estimates annual probabilities from biennial HRS data, so the annual estimates must be converted to two-year horizons before they can be compared to the raw data.
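One standard way to obtain two-year probabilities from annual Markov transition matrices is to multiply the matrices for two consecutive ages and then condition on survival. The sketch below illustrates that calculation; whether it matches the exact procedure used to produce the graphs is an assumption, and the function name and the small three-state demo matrix are hypothetical.

```python
import numpy as np


def two_year_probabilities(p_age, p_next):
    """Chain two annual matrices whose last state is the absorbing death state.

    Returns the health-to-health probabilities conditional on surviving
    two years, and the two-year survival probability by initial state.
    """
    p_age = np.asarray(p_age, dtype=float)
    p_next = np.asarray(p_next, dtype=float)
    # Unconditional two-year transition matrix
    p2 = p_age @ p_next
    # Survival probability for each initial (non-death) state
    survival = 1.0 - p2[:-1, -1]
    # Health transitions conditional on survival
    health_given_survival = p2[:-1, :-1] / survival[:, None]
    return health_given_survival, survival


# Synthetic demo with two health states plus death (NOT the actual estimates)
annual = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.6, 0.2],
    [0.0, 0.0, 1.0],
])
health_2y, surv_2y = two_year_probabilities(annual, annual)
```

With the data files, `p_age` and `p_next` would be the 6-by-6 matrices for ages a and a+1.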

The estimation is performed separately for each combination of gender (male/female) and race (black/nonblack).

Shaded areas represent bootstrapped 95% confidence intervals (not included in the data files).

Male/nonblack

Female/nonblack

Male/black

Female/black

Transition probabilities at one-year horizons

The next two graphs show the annual transition probabilities, which correspond to the contents of the data files.

Male/nonblack and female/nonblack

Male/black and female/black

Empirical health distribution

To perform simulations using the above transition and survival probabilities, an initial health distribution is required.

We provide the empirical distribution over health at ages 50-51 and 70-71 observed in the HRS, for the black/nonblack and male/female groups, in the files Health-5/CSV/H5_dist_health.csv and Health-5/Excel/H5_dist_health.xlsx.
These population shares are computed from the estimation sample using the respondent-level weights.
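As a rough illustration of how an initial distribution combines with the transition matrices, the sketch below pushes a distribution forward one year at a time. The function name is hypothetical, the initial shares and matrices are synthetic placeholders rather than values from the files, and the initial vector is assumed to carry an extra entry of zero for the death state.

```python
import numpy as np


def propagate_distribution(init_dist, trans):
    """Push a distribution over states forward through a sequence of matrices.

    init_dist: shares over all states, including a final entry for death.
    trans: array of shape (n_ages, n_states, n_states), one matrix per age.
    Returns an array of shape (n_ages + 1, n_states) with one row per age.
    """
    trans = np.asarray(trans, dtype=float)
    dist = np.asarray(init_dist, dtype=float)
    path = np.empty((trans.shape[0] + 1, dist.size))
    path[0] = dist
    for t, p in enumerate(trans):
        # Next year's distribution is this year's distribution times the matrix
        path[t + 1] = path[t] @ p
    return path


# Synthetic demo with two health states plus death (NOT the actual estimates)
annual = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.6, 0.2],
    [0.0, 0.0, 1.0],
])
trans = np.stack([annual, annual])
init = np.array([0.5, 0.5, 0.0])  # hypothetical initial shares, nobody dead

path = propagate_distribution(init, trans)
```

The last column of `path` is then the cumulative share that has died by each age, and one minus that column gives the survival curve for the cohort.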
