Karsten T. Hansen, James J. Heckman and Kathleen J. Mullen
Forthcoming in Journal of Econometrics
Note that AFQT component scores are standardized to have within-sample mean 0 and variance 1.
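As an illustration of this standardization, the following Python sketch rescales a list of scores to mean 0 and variance 1 within the sample. The function name and the toy scores are hypothetical, and dividing by n rather than n-1 when computing the variance is an assumption; the source does not say which convention was used.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Rescale scores to have mean 0 and variance 1 within the given sample."""
    m = mean(scores)
    s = pstdev(scores)  # population SD (divides by n); this choice is an assumption
    if s == 0:
        raise ValueError("scores have zero variance and cannot be standardized")
    return [(x - m) / s for x in scores]

# Toy scores, made up for illustration only
z = standardize([12.0, 15.0, 9.0, 18.0, 6.0])
```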
Rather than drop observations without information on parents’ education and family income, which would result in a loss of almost a quarter of our sample, we decided to impute values for the missing data. The imputation procedure was straightforward. For each variable with missing values (mother’s education, father’s education and family income):
(1) We ran an OLS regression of the nonmissing values on the following variables: dummy for southern residence at age 14, dummy for urban residence at age 14, dummy for broken home status at age 14, number of siblings, and year of birth dummies.
(2) We then used estimates from step 1 to compute predicted values for the missing data.
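The two steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the toy data are made up, and only two regressors (a southern-residence dummy and number of siblings) are used instead of the full list in step 1.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols_impute(y, X):
    """Replace None entries of y with OLS predictions from X (intercept added)."""
    rows = [[1.0] + list(x) for x in X]
    obs = [i for i, v in enumerate(y) if v is not None]
    k = len(rows[0])
    # Step 1: OLS on the nonmissing values via the normal equations (X'X) beta = X'y
    xtx = [[sum(rows[i][a] * rows[i][b] for i in obs) for b in range(k)]
           for a in range(k)]
    xty = [sum(rows[i][a] * y[i] for i in obs) for a in range(k)]
    beta = solve(xtx, xty)
    # Step 2: predicted values fill in the missing entries only
    return [v if v is not None
            else sum(bj * xj for bj, xj in zip(beta, rows[i]))
            for i, v in enumerate(y)]

# Toy data: (southern dummy, number of siblings); outcome is years of education
X = [(0, 1), (1, 2), (0, 3), (1, 0), (0, 2), (1, 4)]
y = [9.5, 11.0, 8.5, 12.0, None, None]
filled = ols_impute(y, X)
```

Observed values pass through unchanged; only the entries marked None are replaced by fitted values.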
The program samples parameter values from an iterative Markov chain (see Appendix C in the paper) whose stationary distribution is the joint posterior distribution of the model parameters. The Fortran source code for the program can be downloaded here.
Note that the data must be in a tab- or space-delimited text file containing numerical values only.
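Before running the program, it may help to verify that an input file meets this requirement. The Python sketch below is a hypothetical helper, not part of the distributed program. It checks that every field parses as a number, and it assumes (the source does not state this) that all rows should have the same number of columns and that blank lines can be ignored.

```python
def check_numeric_table(lines):
    """Return the column count if every row holds only whitespace-delimited numbers."""
    width = None
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()  # no argument: splits on runs of tabs and spaces
        if not fields:
            continue  # blank lines are ignored here (an assumption)
        for field in fields:
            try:
                float(field)
            except ValueError:
                raise ValueError(
                    f"line {lineno}: non-numeric value {field!r}") from None
        if width is None:
            width = len(fields)
        elif len(fields) != width:
            raise ValueError(
                f"line {lineno}: expected {width} columns, found {len(fields)}")
    return width

# Made-up rows mixing tab and space delimiters
sample = ["1\t0\t12.5", "0 1 -0.75", "1 1 3e2"]
n_columns = check_numeric_table(sample)
```

To check a file on disk, pass an open file object in place of the list, since iterating over it yields one line at a time.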
C:\EXE_DIR\lfm_v4 C:\INPUT_DIR\input_file.txt n_draws n_burn n_skips
where
n_draws is the number of draws from the Markov chain that will be recorded,
n_burn is the number of initial (burn-in) draws the program will discard, and
n_skips is the thinning interval: after burn-in, the program records only one of every n_skips draws.
For example:
C:\EXE_DIR\lfm_v4 C:\INPUT_DIR\input_file.txt 5000 10000 2
will instruct the program to sample 2*5000+10000=20,000 times from the Markov chain, discarding the first 10,000 draws and recording every other draw of the 10,000 sampled thereafter.
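The arithmetic in this example can be checked with a short Python sketch. The function name is hypothetical, and the rule for which draw in each block of n_skips is recorded (here, the last one) is an assumption, since the text only says that every other draw is kept.

```python
def chain_plan(n_draws, n_burn, n_skips):
    """Return the total number of draws and the 1-based indices of recorded draws."""
    total = n_skips * n_draws + n_burn
    # Assumption: the last draw of each post-burn-in block of n_skips is recorded
    recorded = list(range(n_burn + n_skips, total + 1, n_skips))
    return total, recorded

total, recorded = chain_plan(n_draws=5000, n_burn=10000, n_skips=2)
print(total)          # total draws sampled from the chain
print(len(recorded))  # number of draws kept after burn-in and thinning
```

Whatever the exact recording rule, the number of recorded draws equals n_draws and the total chain length is n_skips*n_draws+n_burn.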
This program will estimate the joint posterior distribution of the parameters of the model. The estimates can then be used to produce, for example, the predicted AFQT distribution or conditional choice probabilities. Fortran and Matlab programs to produce the estimated posterior quantities discussed in the paper are available from the authors on request.