Toward improved statistical methods for analyzing Cotinine-Biomarker health association data

Tulay Koru-Sengul, John D. Clark, Lora E. Fleming, David J. Lee

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Background: Serum cotinine, a metabolite of nicotine, is frequently used in research as a biomarker of recent tobacco smoke exposure. Historically, secondhand smoke (SHS) research has used suboptimal statistical methods because of censored serum cotinine values, meaning measurements below the limit of detection (LOD). Methods: We compared commonly used methods for analyzing censored serum cotinine data using parametric and non-parametric techniques employing data from the 1999-2004 National Health and Nutrition Examination Surveys (NHANES). To illustrate the differences in associations obtained by various analytic methods, we compared parameter estimates for the association between cotinine and the inflammatory marker homocysteine using complete case analysis, single and multiple imputation, "reverse" Kaplan-Meier, and logistic regression models. Results: Parameter estimates and statistical significance varied according to the statistical method used with censored serum cotinine values. Single imputation of censored values with either 0, LOD or LOD/2 yielded similar estimates and significance; the multiple imputation method yielded smaller estimates than the other methods, without statistical significance. Multiple regression modelling using the "reverse" Kaplan-Meier method yielded statistically significant estimates that were larger than those from parametric methods. Conclusions: Analyses of serum cotinine data with values below the LOD require special attention. "Reverse" Kaplan-Meier was the only method inherently able to deal with censored data with multiple LODs, and may be the most accurate since it avoids data manipulation needed for use with other commonly used statistical methods. Additional research is needed into the identification of optimal statistical methods for analysis of SHS biomarkers subject to a LOD.
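As a rough illustration of two of the approaches named in the abstract, the sketch below applies single imputation (0, LOD, LOD/2) and a "reverse" Kaplan-Meier estimate to left-censored values. It is not the authors' code: the toy data, the names `SAMPLES`, `substitute`, and `reverse_km_cdf`, and the choice to estimate only the distribution (rather than fit the regression models used in the paper) are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical toy records: (value, censored). For a censored record the
# value is the LOD that applied to that sample, not a measured concentration.
SAMPLES = [
    (0.05, True), (0.05, True), (0.015, True),   # below two different LODs
    (0.12, False), (0.30, False), (0.30, False), (1.80, False),
]


def substitute(samples, rule):
    """Single imputation: replace each censored record by rule(LOD)."""
    return [rule(value) if censored else value for value, censored in samples]


def reverse_km_cdf(samples):
    """Estimate P(X < x) at each distinct detected value x.

    Flipping left-censored data (T = M - X) turns it into right-censored
    data; the product-limit estimate on T is equivalent to the loop below,
    which walks the detected values from largest to smallest.
    """
    detected = sorted({v for v, censored in samples if not censored}, reverse=True)
    estimate, prob = {}, 1.0
    for x in detected:
        # Records whose value (or LOD) is at or below x are still "at risk".
        at_risk = sum(1 for v, _ in samples if v <= x)
        events = sum(1 for v, censored in samples if v == x and not censored)
        prob *= 1.0 - events / at_risk
        estimate[x] = prob
    return estimate


if __name__ == "__main__":
    rules = [("0", lambda lod: 0.0),
             ("LOD", lambda lod: lod),
             ("LOD/2", lambda lod: lod / 2)]
    for label, rule in rules:
        print(f"mean with censored -> {label}: {mean(substitute(SAMPLES, rule)):.4f}")
    for x, p in sorted(reverse_km_cdf(SAMPLES).items()):
        print(f"P(X < {x}) ~ {p:.4f}")
```

The substitution rules each need one fill-in value per censored record, whereas the product-limit loop uses each record's own LOD directly, which is why it accommodates several LODs in one data set without extra manipulation.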

Original language: English (US)
Article number: 11
Journal: Tobacco Induced Diseases
Volume: 9
Issue number: 1
DOI: 10.1186/1617-9625-9-11
State: Published - Oct 6 2011

Fingerprint

Cotinine
statistical method
Biomarkers
Health
health
Limit of Detection
statistical significance
nicotine
Values
Serum
Tobacco Smoke Pollution
regression
Logistic Models
Research
nutrition
manipulation
Nutrition Surveys
logistics
Homocysteine
Nicotine

ASJC Scopus subject areas

  • Health (social science)
  • Medicine (miscellaneous)
  • Public Health, Environmental and Occupational Health

Cite this

Toward improved statistical methods for analyzing Cotinine-Biomarker health association data. / Koru-Sengul, Tulay; Clark, John D.; Fleming, Lora E.; Lee, David J.

In: Tobacco Induced Diseases, Vol. 9, No. 1, 11, 06.10.2011.

@article{4ee55ea445cd48c18f4065fef352b48b,
title = "Toward improved statistical methods for analyzing Cotinine-Biomarker health association data",
abstract = "Background: Serum cotinine, a metabolite of nicotine, is frequently used in research as a biomarker of recent tobacco smoke exposure. Historically, secondhand smoke (SHS) research has used suboptimal statistical methods because of censored serum cotinine values, meaning measurements below the limit of detection (LOD). Methods: We compared commonly used methods for analyzing censored serum cotinine data using parametric and non-parametric techniques employing data from the 1999-2004 National Health and Nutrition Examination Surveys (NHANES). To illustrate the differences in associations obtained by various analytic methods, we compared parameter estimates for the association between cotinine and the inflammatory marker homocysteine using complete case analysis, single and multiple imputation, {"}reverse{"} Kaplan-Meier, and logistic regression models. Results: Parameter estimates and statistical significance varied according to the statistical method used with censored serum cotinine values. Single imputation of censored values with either 0, LOD or LOD/2 yielded similar estimates and significance; the multiple imputation method yielded smaller estimates than the other methods, without statistical significance. Multiple regression modelling using the {"}reverse{"} Kaplan-Meier method yielded statistically significant estimates that were larger than those from parametric methods. Conclusions: Analyses of serum cotinine data with values below the LOD require special attention. {"}Reverse{"} Kaplan-Meier was the only method inherently able to deal with censored data with multiple LODs, and may be the most accurate since it avoids data manipulation needed for use with other commonly used statistical methods. Additional research is needed into the identification of optimal statistical methods for analysis of SHS biomarkers subject to a LOD.",
author = "Koru-Sengul, Tulay and Clark, {John D.} and Fleming, {Lora E.} and Lee, {David J.}",
year = "2011",
month = "10",
day = "6",
doi = "10.1186/1617-9625-9-11",
language = "English (US)",
volume = "9",
journal = "Tobacco Induced Diseases",
issn = "1617-9625",
publisher = "BioMed Central",
number = "1",
}

TY  - JOUR
T1  - Toward improved statistical methods for analyzing Cotinine-Biomarker health association data
AU  - Koru-Sengul, Tulay
AU  - Clark, John D.
AU  - Fleming, Lora E.
AU  - Lee, David J.
PY  - 2011/10/6
Y1  - 2011/10/6
AB  - Background: Serum cotinine, a metabolite of nicotine, is frequently used in research as a biomarker of recent tobacco smoke exposure. Historically, secondhand smoke (SHS) research has used suboptimal statistical methods because of censored serum cotinine values, meaning measurements below the limit of detection (LOD). Methods: We compared commonly used methods for analyzing censored serum cotinine data using parametric and non-parametric techniques employing data from the 1999-2004 National Health and Nutrition Examination Surveys (NHANES). To illustrate the differences in associations obtained by various analytic methods, we compared parameter estimates for the association between cotinine and the inflammatory marker homocysteine using complete case analysis, single and multiple imputation, "reverse" Kaplan-Meier, and logistic regression models. Results: Parameter estimates and statistical significance varied according to the statistical method used with censored serum cotinine values. Single imputation of censored values with either 0, LOD or LOD/2 yielded similar estimates and significance; the multiple imputation method yielded smaller estimates than the other methods, without statistical significance. Multiple regression modelling using the "reverse" Kaplan-Meier method yielded statistically significant estimates that were larger than those from parametric methods. Conclusions: Analyses of serum cotinine data with values below the LOD require special attention. "Reverse" Kaplan-Meier was the only method inherently able to deal with censored data with multiple LODs, and may be the most accurate since it avoids data manipulation needed for use with other commonly used statistical methods. Additional research is needed into the identification of optimal statistical methods for analysis of SHS biomarkers subject to a LOD.
UR  - http://www.scopus.com/inward/record.url?scp=80053442609&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=80053442609&partnerID=8YFLogxK
U2  - 10.1186/1617-9625-9-11
DO  - 10.1186/1617-9625-9-11
M3  - Article
C2  - 21968135
AN  - SCOPUS:80053442609
VL  - 9
JO  - Tobacco Induced Diseases
JF  - Tobacco Induced Diseases
SN  - 1617-9625
IS  - 1
M1  - 11
ER  - 