Modeling heaping in self-reported cigarette counts

Research output: Contribution to journal › Article

41 Citations (Scopus)

Abstract

In studies of smoking behavior, some subjects report exact cigarette counts, whereas others report rounded-off counts, particularly multiples of 20, 10 or 5. This form of data reporting error, known as heaping, can bias the estimation of parameters of interest such as mean cigarette consumption. We present a model to describe heaped count data from a randomized trial of bupropion treatment for smoking cessation. The model posits that the reported cigarette count is a deterministic function of an underlying precise cigarette count variable and a heaping behavior variable, both of which are at best partially observed. To account for an excess of zeros, as would likely occur in a smoking cessation study where some subjects successfully quit, we model the underlying count variable with zero-inflated count distributions. We study the sensitivity of the inference on smoking cessation by fitting various models that either do or do not account for heaping and zero inflation, comparing the models by means of Bayes factors. Our results suggest that sufficiently rich models for both the underlying distribution and the heaping behavior are indispensable to obtaining a good fit with heaped smoking data. The analyses moreover reveal that bupropion has a significant effect on the fraction abstinent, but not on mean cigarette consumption among the non-abstinent.
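The structure described in the abstract, a reported count that is a deterministic function of a latent exact count and a latent heaping behavior, can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' model: the rounding grids (1, 5, 10, 20), their probabilities, the Poisson mean, the zero-inflation probability, and all function names are hypothetical choices made only for illustration. In particular, this simple rounding rule can map small positive counts to zero, which a careful model would need to treat separately.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate (Knuth's method; fine for moderate lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_zip(rng, lam, p_zero):
    """Zero-inflated Poisson: a structural zero with probability p_zero."""
    if rng.random() < p_zero:
        return 0
    return sample_poisson(rng, lam)


def report(true_count, grid):
    """Deterministic reporting rule: nearest multiple of grid (ties round up)."""
    return grid * ((2 * true_count + grid) // (2 * grid))


def simulate(n, lam=15.0, p_zero=0.3,
             grids=(1, 5, 10, 20), weights=(0.4, 0.2, 0.3, 0.1), seed=0):
    """Return (true_count, reported_count) pairs; all defaults are assumed values."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = sample_zip(rng, lam, p_zero)            # latent exact count
        g = rng.choices(grids, weights=weights)[0]  # latent heaping behavior
        pairs.append((x, report(x, g)))             # only the report is observed
    return pairs


if __name__ == "__main__":
    data = simulate(1000)
    true_on_5 = sum(x % 5 == 0 for x, _ in data)
    rep_on_5 = sum(y % 5 == 0 for _, y in data)
    print("true counts on multiples of 5:    ", true_on_5)
    print("reported counts on multiples of 5:", rep_on_5)
```

Comparing the two tallies shows how heaping piles reported values onto multiples of 5 even when the underlying counts are spread smoothly, which is the distortion the abstract warns can bias estimates such as mean consumption.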

Original language: English (US)
Pages (from-to): 3789-3804
Number of pages: 16
Journal: Statistics in Medicine
Volume: 27
Issue number: 19
DOI: 10.1002/sim.3281
State: Published - Aug 30 2008

Keywords

  • Bayesian inference
  • Heaped data
  • Rounded data
  • Smoking cessation
  • Zero-inflated negative binomial
  • Zero-inflated Poisson

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

Cite this

Modeling heaping in self-reported cigarette counts. / Wang, Hao; Heitjan, Daniel F.

In: Statistics in Medicine, Vol. 27, No. 19, 30.08.2008, p. 3789-3804.

@article{2ff5dbb97027491a959597bdf22ed45b,
title = "Modeling heaping in self-reported cigarette counts",
abstract = "In studies of smoking behavior, some subjects report exact cigarette counts, whereas others report rounded-off counts, particularly multiples of 20, 10 or 5. This form of data reporting error, known as heaping, can bias the estimation of parameters of interest such as mean cigarette consumption. We present a model to describe heaped count data from a randomized trial of bupropion treatment for smoking cessation. The model posits that the reported cigarette count is a deterministic function of an underlying precise cigarette count variable and a heaping behavior variable, both of which are at best partially observed. To account for an excess of zeros, as would likely occur in a smoking cessation study where some subjects successfully quit, we model the underlying count variable with zero-inflated count distributions. We study the sensitivity of the inference on smoking cessation by fitting various models that either do or do not account for heaping and zero inflation, comparing the models by means of Bayes factors. Our results suggest that sufficiently rich models for both the underlying distribution and the heaping behavior are indispensable to obtaining a good fit with heaped smoking data. The analyses moreover reveal that bupropion has a significant effect on the fraction abstinent, but not on mean cigarette consumption among the non-abstinent.",
keywords = "Bayesian inference, Heaped data, Rounded data, Smoking cessation, Zero-inflated negative binomial, Zero-inflated Poisson",
author = "Wang, Hao and Heitjan, {Daniel F.}",
year = "2008",
month = "8",
day = "30",
doi = "10.1002/sim.3281",
language = "English (US)",
volume = "27",
pages = "3789--3804",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "19",

}

TY - JOUR

T1 - Modeling heaping in self-reported cigarette counts

AU - Wang, Hao

AU - Heitjan, Daniel F.

PY - 2008/8/30

Y1 - 2008/8/30

AB - In studies of smoking behavior, some subjects report exact cigarette counts, whereas others report rounded-off counts, particularly multiples of 20, 10 or 5. This form of data reporting error, known as heaping, can bias the estimation of parameters of interest such as mean cigarette consumption. We present a model to describe heaped count data from a randomized trial of bupropion treatment for smoking cessation. The model posits that the reported cigarette count is a deterministic function of an underlying precise cigarette count variable and a heaping behavior variable, both of which are at best partially observed. To account for an excess of zeros, as would likely occur in a smoking cessation study where some subjects successfully quit, we model the underlying count variable with zero-inflated count distributions. We study the sensitivity of the inference on smoking cessation by fitting various models that either do or do not account for heaping and zero inflation, comparing the models by means of Bayes factors. Our results suggest that sufficiently rich models for both the underlying distribution and the heaping behavior are indispensable to obtaining a good fit with heaped smoking data. The analyses moreover reveal that bupropion has a significant effect on the fraction abstinent, but not on mean cigarette consumption among the non-abstinent.

KW - Bayesian inference

KW - Heaped data

KW - Rounded data

KW - Smoking cessation

KW - Zero-inflated negative binomial

KW - Zero-inflated Poisson

UR - http://www.scopus.com/inward/record.url?scp=50449091206&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=50449091206&partnerID=8YFLogxK

U2 - 10.1002/sim.3281

DO - 10.1002/sim.3281

M3 - Article

C2 - 18407584

AN - SCOPUS:50449091206

VL - 27

SP - 3789

EP - 3804

JO - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

IS - 19

ER -