Application of multidimensional selective item response regression model for studying multiple gene methylation in SV40 oncogenic pathways

Haiqun Lin, Ziding Feng, Yan Yu, Yingye Zheng, Narayan Shivapurkar, Adi F. Gazdar

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Alteration of gene methylation patterns has been reported to be involved in the early onset of many human malignancies. Many exogenous risk factors, such as cigarette smoke, dietary additives, chemical exposures, radiation, and biologic agents including viral infection, are involved in the methylation pathways of cancers. We propose a multidimensional selective item response regression model to describe and test how a risk factor may alter molecular pathways involving aberrant methylation of multiple genes in oncogenesis. Our modeling framework is built on an item response model for multivariate dichotomous responses of high dimension, such as aberrant methylation of multiple tumor-suppressor genes, but we allow risk factors such as SV40 viral infection to alter the distribution of the latent factors that subsequently affect the outcome of cancer. We postulate empirical identification conditions under our model formulation. Moreover, we do not prespecify the links between the multiple dichotomous methylation responses and the latent factors, but rather conduct specification searches with a genetic algorithm to discover the links. Parameter estimation through maximum likelihood and specification searches in models with multidimensional latent factors for multivariate binary responses have become practical only recently, owing to developments in modern statistical computing. We illustrate our proposal with the biological finding that simultaneous methylation of multiple tumor-suppressor genes is associated with the presence of SV40 viral sequences and with the cancer status of leukemia/lymphoma. We are able to test whether the data are consistent with the causal hypothesis that SV40 induces aberrant methylation of multiple genes in its oncogenic pathways. At the same time, we are able to evaluate the role of SV40 in the methylation pathway and to determine whether the methylation pathway is responsible for the development of leukemia/lymphoma.
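
As a reading aid, the structure described in the abstract can be sketched in generic latent-variable notation. This is only an illustration under stated assumptions: the symbols, the logit links, and the normal distribution for the latent factors are all assumptions made here and are not taken from the article, whose exact formulation may differ.

```latex
% Illustrative notation only; none of these symbols are taken from the article.
% y_{ij}  : aberrant-methylation indicator (0/1) for gene j in subject i
% x_i     : risk factor for subject i, e.g. SV40 status
% \eta_i  : vector of K latent factors for subject i
% z_i     : cancer status (0/1) for subject i
\begin{align}
  \operatorname{logit} \Pr(y_{ij} = 1 \mid \eta_i)
    &= \alpha_j + \sum_{k=1}^{K} \delta_{jk}\, \lambda_{jk}\, \eta_{ik},
    \qquad \delta_{jk} \in \{0, 1\}, \\
  \eta_i
    &= \gamma\, x_i + \varepsilon_i,
    \qquad \varepsilon_i \sim N(0, \Sigma), \\
  \operatorname{logit} \Pr(z_i = 1 \mid \eta_i)
    &= \beta_0 + \beta^{\top} \eta_i .
\end{align}
```

In this sketch the indicators \delta_{jk} play the role of the links between genes and latent factors that a specification search would select, \gamma \neq 0 would correspond to the risk factor shifting the latent methylation factors, and \beta \neq 0 would correspond to those factors being associated with cancer status.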

Original language: English (US)
Pages (from-to): 201-211
Number of pages: 11
Journal: Journal of the American Statistical Association
Volume: 103
Issue number: 481
DOI: 10.1198/016214507000000428
State: Published - Mar 2008

Fingerprint

Pathway
Regression Model
Gene
Risk Factors
Multivariate Response
Cancer
Leukemia
Infection
Tumor
Specification
Statistical Computing
Binary Response
Postulate
Higher Dimensions
Maximum Likelihood
Parameter Estimation
Radiation
Genetic Algorithm
Model

Keywords

  • Biomarker
  • Causal pathway
  • Factor analysis
  • Genetic algorithm
  • Identification
  • Item response
  • Joint model
  • Latent variable
  • Specification search

ASJC Scopus subject areas

  • Mathematics(all)
  • Statistics and Probability

Cite this

Application of multidimensional selective item response regression model for studying multiple gene methylation in SV40 oncogenic pathways. / Lin, Haiqun; Feng, Ziding; Yu, Yan; Zheng, Yingye; Shivapurkar, Narayan; Gazdar, Adi F.

In: Journal of the American Statistical Association, Vol. 103, No. 481, 03.2008, p. 201-211.

@article{426b0a194f3241c8946dc1231eaefbe3,
title = "Application of multidimensional selective item response regression model for studying multiple gene methylation in {SV40} oncogenic pathways",
abstract = "Alteration of gene methylation patterns has been reported to be involved in the early onsets of many human malignancies. Many exogenous risk factors, such as cigarette smoke, dietary additives, chemical exposures, radiation, and biologic agents including viral infection, are involved in the methylation pathways of cancers. We propose a multidimensional selective item response regression model to describe and test how a risk factor may alter molecular pathways involving aberrant methylation of multiple genes in oncogenesis. Our modeling framework is built on an item response model for multivariate dichotomous responses of high dimension, such as aberrant methylation of multiple tumor-suppressor genes, but we allow risk factors such as SV40 viral infection to alter the distribution of the latent factors that subsequently affect the outcome of cancer. We postulate empirical identification conditions under our model formulation. Moreover, we do not prespecify the links between the multiple dichotomous methylation responses and the latent factors, but rather conduct specification searches with a genetic algorithm to discover the links. Parameter estimation through maximum likelihood and specification searches in models with multidimensional latent factors for multivariate binary responses have become practical only recently, due to modern statistical computing development. We illustrate our proposal with the biological finding that simultaneous methylation of multiple tumor-suppressor genes is associated with the presence of SV40 viral sequences and with the cancer status of lymphoma/leukemia. We are able to test whether the data are consistent with the causal hypothesis that SV40 induces aberrant methylation of multiple genes in its oncogenic pathways. At the same time, we are able to evaluate the role of SV40 in the methylation pathway and to determine whether the methylation pathway is responsible for the development of leukemia/lymphoma.",
keywords = "Biomarker, Causal pathway, Factor analysis, Genetic algorithm, Identification, Item response, Joint model, Latent variable, Specification search",
author = "Haiqun Lin and Ziding Feng and Yan Yu and Yingye Zheng and Narayan Shivapurkar and Adi F. Gazdar",
year = "2008",
month = "3",
doi = "10.1198/016214507000000428",
language = "English (US)",
volume = "103",
pages = "201--211",
journal = "Journal of the American Statistical Association",
issn = "0162-1459",
publisher = "Taylor and Francis Ltd.",
number = "481",
}

TY - JOUR
T1 - Application of multidimensional selective item response regression model for studying multiple gene methylation in SV40 oncogenic pathways
AU - Lin, Haiqun
AU - Feng, Ziding
AU - Yu, Yan
AU - Zheng, Yingye
AU - Shivapurkar, Narayan
AU - Gazdar, Adi F.
PY - 2008/3
Y1 - 2008/3
N2 - Alteration of gene methylation patterns has been reported to be involved in the early onsets of many human malignancies. Many exogenous risk factors, such as cigarette smoke, dietary additives, chemical exposures, radiation, and biologic agents including viral infection, are involved in the methylation pathways of cancers. We propose a multidimensional selective item response regression model to describe and test how a risk factor may alter molecular pathways involving aberrant methylation of multiple genes in oncogenesis. Our modeling framework is built on an item response model for multivariate dichotomous responses of high dimension, such as aberrant methylation of multiple tumor-suppressor genes, but we allow risk factors such as SV40 viral infection to alter the distribution of the latent factors that subsequently affect the outcome of cancer. We postulate empirical identification conditions under our model formulation. Moreover, we do not prespecify the links between the multiple dichotomous methylation responses and the latent factors, but rather conduct specification searches with a genetic algorithm to discover the links. Parameter estimation through maximum likelihood and specification searches in models with multidimensional latent factors for multivariate binary responses have become practical only recently, due to modern statistical computing development. We illustrate our proposal with the biological finding that simultaneous methylation of multiple tumor-suppressor genes is associated with the presence of SV40 viral sequences and with the cancer status of lymphoma/leukemia. We are able to test whether the data are consistent with the causal hypothesis that SV40 induces aberrant methylation of multiple genes in its oncogenic pathways. At the same time, we are able to evaluate the role of SV40 in the methylation pathway and to determine whether the methylation pathway is responsible for the development of leukemia/lymphoma.
KW - Biomarker
KW - Causal pathway
KW - Factor analysis
KW - Genetic algorithm
KW - Identification
KW - Item response
KW - Joint model
KW - Latent variable
KW - Specification search
UR - http://www.scopus.com/inward/record.url?scp=42349104040&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=42349104040&partnerID=8YFLogxK
U2 - 10.1198/016214507000000428
DO - 10.1198/016214507000000428
M3 - Article
VL - 103
SP - 201
EP - 211
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
SN - 0162-1459
IS - 481
ER -