Comparison of missing data approaches in linkage analysis.

Chao Xing, Fredrick R. Schumacher, David V. Conti, John S. Witte

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Observational cohort studies have been little used in linkage analyses because they generally lack large, disease-specific pedigrees. Nevertheless, the longitudinal nature of such studies makes them potentially valuable for assessing the linkage between genotypes and temporal trends in phenotypes. The repeated phenotype measures in cohort studies (i.e., across time), however, can have extensive missing information. Existing methods for handling missing data in observational studies may decrease efficiency, introduce biases, and give spurious results. The impact of such methods when undertaking linkage analysis of cohort studies is unclear. Therefore, we compare here the effects of six methods for handling missing repeated phenotypes on the results of genome-wide linkage analyses of four quantitative traits from the Framingham Heart Study cohort. We found that simply deleting observations with missing values gave many more nominally statistically significant linkages than the other five approaches. Among the latter, those with similar underlying methodology (i.e., imputation- versus model-based) gave the most consistent results, although some discrepancies remained. Different methods for addressing missing values in linkage analyses of cohort studies can give substantially different results, and must be carefully considered to protect against biases and spurious findings.
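As a rough illustration of why the choice of approach matters, the sketch below contrasts deleting subjects with missing repeated measures against a simple single imputation. The abstract does not name the six methods compared, so this is not the authors' procedure: the data values, the subject identifiers, and the functions `complete_case`, `subject_mean_impute`, and `subject_summary` are all hypothetical.

```python
from statistics import mean

# Hypothetical toy data: one phenotype measured at four exams per subject.
# None marks a missed exam. Values are invented for illustration only.
phenotypes = {
    "subj1": [120.0, 124.0, None, 130.0],
    "subj2": [110.0, None, None, 118.0],
    "subj3": [135.0, 138.0, 140.0, 142.0],
}


def complete_case(data):
    """Deletion: keep only subjects with no missing measurements."""
    return {s: v for s, v in data.items() if all(x is not None for x in v)}


def subject_mean_impute(data):
    """Single imputation: fill each gap with the subject's own observed mean."""
    filled = {}
    for subj, values in data.items():
        observed = [x for x in values if x is not None]
        fill = mean(observed)
        filled[subj] = [fill if x is None else x for x in values]
    return filled


def subject_summary(data):
    """Per-subject mean across exams, a simple trait summary for analysis."""
    return {s: mean(v) for s, v in data.items()}


deleted = subject_summary(complete_case(phenotypes))
imputed = subject_summary(subject_mean_impute(phenotypes))

print("subjects retained after deletion:", sorted(deleted))
print("subjects retained after imputation:", sorted(imputed))
```

In this toy case deletion keeps one subject out of three while imputation keeps all three, so the two approaches hand different samples to any downstream analysis. Which behaves better in a real linkage study depends on why the data are missing.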

Original language: English (US)
Article number: S44
Journal: BMC Genetics
Volume: 4 Suppl 1
State: Published - 2003

Fingerprint

Cohort Studies
Phenotype
Observational Studies
Pedigree
Genotype
Genome

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)

Cite this

Xing, C., Schumacher, F. R., Conti, D. V., & Witte, J. S. (2003). Comparison of missing data approaches in linkage analysis. BMC Genetics, 4 Suppl 1, [S44].

@article{3ec26f54ada746f28926f6f85862a424,
title = "Comparison of missing data approaches in linkage analysis.",
author = "Chao Xing and Schumacher, {Fredrick R.} and Conti, {David V.} and Witte, {John S.}",
year = "2003",
language = "English (US)",
volume = "4 Suppl 1",
journal = "BMC Genetics",
issn = "1471-2156",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Comparison of missing data approaches in linkage analysis.

AU - Xing, Chao

AU - Schumacher, Fredrick R.

AU - Conti, David V.

AU - Witte, John S.

PY - 2003

Y1 - 2003

UR - http://www.scopus.com/inward/record.url?scp=34248652911&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34248652911&partnerID=8YFLogxK

M3 - Article

C2 - 14975112

AN - SCOPUS:34248652911

VL - 4 Suppl 1

JO - BMC Genetics

JF - BMC Genetics

SN - 1471-2156

M1 - S44

ER -