Decomposing Pearson's χ² test: A linear regression and its departure from linearity

Zhengyang Zhou, Hung Chih Ku, Guan Xing, Chao Xing

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

In case-control genetic association studies, a standard practice is to perform the Cochran-Armitage (CA) trend test under the assumption of the additive model because of its robustness. We could even identify situations in which it outperformed the analysis model consistent with the underlying inheritance mode. In this article, we analytically reveal the statistical basis that leads to this phenomenon. By elucidating the origin of the CA trend test as a linear regression model, we decompose Pearson's χ²-test statistic into two components: one is the CA trend test statistic, which measures the goodness of fit of the linear regression model, and the other measures the discrepancy between the data and the linear regression model. Under this framework, we show that the additive coding scheme, as well as the multiplicative coding scheme, increases the coefficient of determination of the regression model by increasing the spread of data points. We also obtain the conditions under which the CA trend test statistic equals the MAX statistic and Pearson's χ²-test statistic.
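
The decomposition described in the abstract can be illustrated numerically. The sketch below is not the authors' derivation; it uses the standard textbook identity for a 2 x 3 table, in which Pearson's χ² (2 df) equals the CA trend statistic (1 df) plus a lack-of-fit term (1 df) from regressing case status on a genotype score. The function names, the additive scores (0, 1, 2), and the genotype counts are all hypothetical choices made for illustration, and every genotype column is assumed to be non-empty.

```python
"""Split Pearson's chi-squared for a 2 x 3 case-control table into a
linear-trend part and a departure-from-linearity part (illustrative sketch)."""


def pearson_chi2(cases, controls):
    """Pearson's chi-squared statistic (k - 1 df) for a 2 x k table."""
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    stat = 0.0
    for r, s in zip(cases, controls):
        n_j = r + s
        for observed, row_total in ((r, n_cases), (s, n_controls)):
            expected = row_total * n_j / n
            stat += (observed - expected) ** 2 / expected
    return stat


def _regression_parts(cases, controls, scores):
    """Least-squares summaries for regressing case status on the score."""
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    p = sum(cases) / n  # overall proportion of cases
    x_bar = sum(n_j * x for n_j, x in zip(totals, scores)) / n
    s_xx = sum(n_j * (x - x_bar) ** 2 for n_j, x in zip(totals, scores))
    s_xy = sum((x - x_bar) * (r - n_j * p)
               for r, n_j, x in zip(cases, totals, scores))
    return totals, p, x_bar, s_xx, s_xy


def ca_trend_chi2(cases, controls, scores=(0, 1, 2)):
    """CA trend statistic (1 df): N times the squared correlation
    between case status and the genotype score."""
    _, p, _, s_xx, s_xy = _regression_parts(cases, controls, scores)
    return s_xy ** 2 / (s_xx * p * (1 - p))


def lack_of_fit_chi2(cases, controls, scores=(0, 1, 2)):
    """Departure from linearity: scaled squared distance between the
    observed case proportions and the fitted regression line."""
    totals, p, x_bar, s_xx, s_xy = _regression_parts(cases, controls, scores)
    slope = s_xy / s_xx
    stat = 0.0
    for r, n_j, x in zip(cases, totals, scores):
        fitted = p + slope * (x - x_bar)
        stat += n_j * (r / n_j - fitted) ** 2
    return stat / (p * (1 - p))


if __name__ == "__main__":
    # Hypothetical genotype counts (aa, Aa, AA); not data from the article.
    cases, controls = [20, 50, 30], [40, 45, 15]
    total = pearson_chi2(cases, controls)
    trend = ca_trend_chi2(cases, controls)
    departure = lack_of_fit_chi2(cases, controls)
    print(f"Pearson chi2      : {total:.4f}")
    print(f"CA trend chi2     : {trend:.4f}")
    print(f"Departure         : {departure:.4f}")
    print(f"Trend + departure : {trend + departure:.4f}")
```

When the observed case proportions fall exactly on the fitted line, the departure term is zero and the CA trend statistic coincides with Pearson's χ², which is one way to read the equality condition mentioned at the end of the abstract.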

Original language: English (US)
Journal: Annals of Human Genetics
DOI: 10.1111/ahg.12257
State: Accepted/In press - Jan 1 2018

Fingerprint

Linear Models
Genetic Association Studies

Keywords

  • Linear regression
  • Ordinary least squares
  • Pearson's chi-squared test
  • Trend test

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)

Cite this

Decomposing Pearson's χ² test: A linear regression and its departure from linearity. / Zhou, Zhengyang; Ku, Hung Chih; Xing, Guan; Xing, Chao.

In: Annals of Human Genetics, 01.01.2018.

@article{3ee2cfb5e284420faac3269b603e3938,
title = "Decomposing {Pearson's} $\chi^2$ test: A linear regression and its departure from linearity",
abstract = "In case-control genetic association studies, a standard practice is to perform the Cochran-Armitage (CA) trend test under the assumption of the additive model because of its robustness. We could even identify situations in which it outperformed the analysis model consistent with the underlying inheritance mode. In this article, we analytically reveal the statistical basis that leads to this phenomenon. By elucidating the origin of the CA trend test as a linear regression model, we decompose Pearson's $\chi^2$-test statistic into two components: one is the CA trend test statistic, which measures the goodness of fit of the linear regression model, and the other measures the discrepancy between the data and the linear regression model. Under this framework, we show that the additive coding scheme, as well as the multiplicative coding scheme, increases the coefficient of determination of the regression model by increasing the spread of data points. We also obtain the conditions under which the CA trend test statistic equals the MAX statistic and Pearson's $\chi^2$-test statistic.",
keywords = "Linear regression, Ordinary least squares, Pearson's chi-squared test, Trend test",
author = "Zhou, Zhengyang and Ku, Hung Chih and Xing, Guan and Xing, Chao",
year = "2018",
month = "1",
day = "1",
doi = "10.1111/ahg.12257",
language = "English (US)",
journal = "Annals of Human Genetics",
issn = "0003-4800",
publisher = "Wiley-Blackwell",
}

TY  - JOUR
T1  - Decomposing Pearson's χ² test
T2  - A linear regression and its departure from linearity
AU  - Zhou, Zhengyang
AU  - Ku, Hung Chih
AU  - Xing, Guan
AU  - Xing, Chao
PY  - 2018/1/1
Y1  - 2018/1/1
AB  - In case-control genetic association studies, a standard practice is to perform the Cochran-Armitage (CA) trend test under the assumption of the additive model because of its robustness. We could even identify situations in which it outperformed the analysis model consistent with the underlying inheritance mode. In this article, we analytically reveal the statistical basis that leads to this phenomenon. By elucidating the origin of the CA trend test as a linear regression model, we decompose Pearson's χ²-test statistic into two components: one is the CA trend test statistic, which measures the goodness of fit of the linear regression model, and the other measures the discrepancy between the data and the linear regression model. Under this framework, we show that the additive coding scheme, as well as the multiplicative coding scheme, increases the coefficient of determination of the regression model by increasing the spread of data points. We also obtain the conditions under which the CA trend test statistic equals the MAX statistic and Pearson's χ²-test statistic.
KW  - Linear regression
KW  - Ordinary least squares
KW  - Pearson's chi-squared test
KW  - Trend test
UR  - http://www.scopus.com/inward/record.url?scp=85047808412&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85047808412&partnerID=8YFLogxK
U2  - 10.1111/ahg.12257
DO  - 10.1111/ahg.12257
M3  - Article
C2  - 29851025
AN  - SCOPUS:85047808412
JO  - Annals of Human Genetics
JF  - Annals of Human Genetics
SN  - 0003-4800
ER  -