A Bayesian approach to joint modeling of protein-DNA binding, gene expression and sequence data

Yang Xie, Wei Pan, Kyeong S. Jeong, Guanghua Xiao, Arkady B. Khodursky

Research output: Contribution to journal › Article

6 Citations (Scopus)

Abstract

The genome-wide DNA-protein-binding data, DNA sequence data and gene expression data represent complementary means to deciphering global and local transcriptional regulatory circuits. Combining these different types of data can not only improve the statistical power, but also provide a more comprehensive picture of gene regulation. In this paper, we propose a novel statistical model to augment protein-DNA-binding data with gene expression and DNA sequence data when available. We specify a hierarchical Bayes model and use Markov chain Monte Carlo simulations to draw inferences. Both simulation studies and an analysis of an experimental data set show that the proposed joint modeling method can significantly improve the specificity and sensitivity of identifying target genes as compared with conventional approaches relying on a single data source.
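The claim that combining data types improves power can be illustrated with a small calculation. The sketch below is a toy illustration only and is not the hierarchical model or the MCMC procedure of the paper. It assumes that each gene has a latent target/non-target status, that each data source yields one score that is normally distributed with unit variance, and that the sources are conditionally independent given the status. The function name `posterior_target` and all parameter values (prior of 0.1, null mean 0, target mean 2) are hypothetical choices made for the example.

```python
import math


def log_normal_pdf(x, mean, sd=1.0):
    """Log density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def posterior_target(scores, prior=0.1, null_mean=0.0, target_mean=2.0):
    """Posterior probability that a gene is a target, by Bayes' theorem.

    `scores` holds one score per data source (e.g. binding, expression).
    Sources are assumed conditionally independent given target status,
    so their log likelihood ratios simply add. All parameter values are
    illustrative assumptions, not estimates from the paper.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for s in scores:
        log_odds += log_normal_pdf(s, target_mean) - log_normal_pdf(s, null_mean)
    return 1.0 / (1.0 + math.exp(-log_odds))


# A gene with an ambiguous binding score (1.0) and a clear expression score (2.0).
binding_only = posterior_target([1.0])
joint = posterior_target([1.0, 2.0])
print(f"binding only: {binding_only:.3f}")
print(f"binding + expression: {joint:.3f}")
```

In this toy setting the binding score alone leaves the posterior at the prior, because a score of 1.0 is equally likely under both components, while the added expression score shifts the posterior toward the target class. In the paper the model parameters are unknown and are inferred by Markov chain Monte Carlo rather than fixed in advance as they are here.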

Original language: English (US)
Pages (from-to): 489-503
Number of pages: 15
Journal: Statistics in Medicine
Volume: 29
Issue number: 4
DOI: 10.1002/sim.3815
State: Published - Feb 20 2010

Keywords

  • Bayesian model
  • ChIP-chip data
  • Joint modeling
  • Microarray

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

Cite this

A Bayesian approach to joint modeling of protein-DNA binding, gene expression and sequence data. / Xie, Yang; Pan, Wei; Jeong, Kyeong S.; Xiao, Guanghua; Khodursky, Arkady B.

In: Statistics in Medicine, Vol. 29, No. 4, 20.02.2010, p. 489-503.

@article{bf46cd0340b743189890b9369c751f0a,
title = "A Bayesian approach to joint modeling of protein-DNA binding, gene expression and sequence data",
abstract = "The genome-wide DNA-protein-binding data, DNA sequence data and gene expression data represent complementary means to deciphering global and local transcriptional regulatory circuits. Combining these different types of data can not only improve the statistical power, but also provide a more comprehensive picture of gene regulation. In this paper, we propose a novel statistical model to augment protein-DNA-binding data with gene expression and DNA sequence data when available. We specify a hierarchical Bayes model and use Markov chain Monte Carlo simulations to draw inferences. Both simulation studies and an analysis of an experimental data set show that the proposed joint modeling method can significantly improve the specificity and sensitivity of identifying target genes as compared with conventional approaches relying on a single data source.",
keywords = "Bayesian model, ChIP-chip data, Joint modeling, Microarray",
author = "Yang Xie and Wei Pan and Jeong, {Kyeong S.} and Guanghua Xiao and Khodursky, {Arkady B.}",
year = "2010",
month = "2",
day = "20",
doi = "10.1002/sim.3815",
language = "English (US)",
volume = "29",
pages = "489--503",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "4",

}

TY - JOUR
T1 - A Bayesian approach to joint modeling of protein-DNA binding, gene expression and sequence data
AU - Xie, Yang
AU - Pan, Wei
AU - Jeong, Kyeong S.
AU - Xiao, Guanghua
AU - Khodursky, Arkady B.
PY - 2010/2/20
Y1 - 2010/2/20
AB - The genome-wide DNA-protein-binding data, DNA sequence data and gene expression data represent complementary means to deciphering global and local transcriptional regulatory circuits. Combining these different types of data can not only improve the statistical power, but also provide a more comprehensive picture of gene regulation. In this paper, we propose a novel statistical model to augment protein-DNA-binding data with gene expression and DNA sequence data when available. We specify a hierarchical Bayes model and use Markov chain Monte Carlo simulations to draw inferences. Both simulation studies and an analysis of an experimental data set show that the proposed joint modeling method can significantly improve the specificity and sensitivity of identifying target genes as compared with conventional approaches relying on a single data source.
KW - Bayesian model
KW - ChIP-chip data
KW - Joint modeling
KW - Microarray
UR - http://www.scopus.com/inward/record.url?scp=76549101232&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=76549101232&partnerID=8YFLogxK
U2 - 10.1002/sim.3815
DO - 10.1002/sim.3815
M3 - Article
VL - 29
SP - 489
EP - 503
JO - Statistics in Medicine
JF - Statistics in Medicine
SN - 0277-6715
IS - 4
ER -