Inference from coarse data via multiple imputation with application to age heaping

Daniel F. Heitjan, Donald B. Rubin

Research output: Contribution to journal › Article › peer-review

113 Scopus citations

Abstract

Multiple imputation is applied to a demographic data set with coarse age measurements for Tanzanian children. The heaped ages are multiply imputed with plausible true ages using (a) a simple naive model and (b) a new, relatively complex model that relates true age to the observed values of heaped age, sex, and anthropometric variables. The imputed true ages are used to create valid inferences under the models and compare inferences across models, thereby revealing sensitivity of inferences to prior specifications, from naive to complex. In addition, diagnostic analyses applied to the imputed data are used to suggest which models appear most appropriate. Because it is not clear just what set of heaping intervals should be used, the models are applied under various assumptions about the heaping: Rounding (to the nearest year or half year) versus a combination of rounding and truncation as practiced in the United States, and medium versus wide heaping interval sizes. The most striking conclusions are the following: (a) inferences are very sensitive to the assumption of strict rounding versus rounding combined with truncation, yet judging from the diagnostics, the data cannot distinguish between such models; and (b) the diagnostics consistently favor the new, more complex model, which, although theoretically more satisfactory, can lead to inferences very similar to those obtained with the naive model. It is concluded that knowledge of the interval widths and heaping process sharpens valid inferences from data of this kind, and that given a specified process, simple and easily programmed multiple-imputation methods can lead to valid inferences.
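The multiple-imputation workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a deliberately simple model in which the true age is uniformly distributed within the heaping interval around the reported age, and it estimates only the mean age. The function names (`heaping_interval`, `impute_once`, `combine`, `mi_mean_age`) and the parameters `mode`, `width`, and `m` are hypothetical, introduced for illustration. The combining step uses the standard multiple-imputation rules: average the point estimates, then add the average within-imputation variance to the between-imputation variance inflated by (1 + 1/m).

```python
import random
import statistics


def heaping_interval(reported_age, mode="round", width=1.0):
    """Return the (low, high) interval of true ages assumed to map to reported_age.

    mode="round":    reported age is the nearest multiple of width.
    mode="truncate": reported age is the true age rounded down.
    """
    if mode == "round":
        return reported_age - width / 2, reported_age + width / 2
    if mode == "truncate":
        return reported_age, reported_age + width
    raise ValueError(f"unknown heaping mode: {mode!r}")


def impute_once(reported, mode, width, rng):
    """Draw one plausible true age per child (assumption: uniform within the interval)."""
    return [rng.uniform(*heaping_interval(a, mode, width)) for a in reported]


def combine(estimates, variances):
    """Combine m completed-data estimates and their variances."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)      # pooled point estimate
    within = statistics.fmean(variances)     # average within-imputation variance
    between = statistics.variance(estimates) # between-imputation variance
    total = within + (1 + 1 / m) * between   # total variance of q_bar
    return q_bar, total


def mi_mean_age(reported, mode="round", width=1.0, m=5, seed=0):
    """Multiply impute true ages m times and pool the estimated mean age."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        ages = impute_once(reported, mode, width, rng)
        estimates.append(statistics.fmean(ages))
        variances.append(statistics.variance(ages) / len(ages))
    return combine(estimates, variances)


if __name__ == "__main__":
    reported_ages = [1, 1, 2, 2, 2, 3, 3, 4, 5, 5]  # made-up heaped ages in years
    for mode in ("round", "truncate"):
        mean_age, total_var = mi_mean_age(reported_ages, mode=mode)
        print(mode, round(mean_age, 3), round(total_var, 4))
```

Under this uniform assumption, switching from rounding to truncation shifts every heaping interval up by half its width, so the pooled mean moves by the same amount. That gives a toy-scale view of why the assumed heaping process can matter for inference.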

Original language: English (US)
Pages (from-to): 304-314
Number of pages: 11
Journal: Journal of the American Statistical Association
Volume: 85
Issue number: 410
State: Published - Jun 1990

Keywords

  • Bayesian methods
  • Grouped data
  • Heaped data
  • Incomplete data
  • Maximum likelihood
  • Missing data
  • Probit modeling
  • Rounded data
  • Selection modeling

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
