Statistical completion of a partially identified graph with applications for the estimation of gene regulatory networks

Donghyeon Yu, Won Son, Johan Lim, Guanghua Xiao

Research output: Contribution to journal › Article

9 Scopus citations

Abstract

We study the estimation of a Gaussian graphical model whose dependent structures are partially identified. In a Gaussian graphical model, an off-diagonal zero entry in the concentration matrix (the inverse covariance matrix) implies the conditional independence of the two corresponding variables, given all other variables. A number of methods have been proposed to estimate a sparse large-scale Gaussian graphical model or, equivalently, a sparse large-scale concentration matrix. In practice, the graph structure to be estimated is often partially identified by other sources or by pre-screening. In this paper, we propose a simple modification of existing methods to take this information into account in the estimation. We show that the partially identified dependent structure reduces the error in estimating the dependent structure. We apply the proposed method to estimating the gene regulatory network from lung cancer data, where protein-protein interactions are partially identified from the human protein reference database. The application shows that the proposed method identified many important cancer genes as hub genes in the constructed lung cancer network. In addition, we validated the prognostic importance of a newly identified cancer gene, PTPN13, in four independent lung cancer datasets. The results indicate that the proposed method could facilitate studying underlying lung cancer mechanisms and identifying reliable biomarkers for lung cancer prognosis.
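
To make concrete the abstract's statement that a zero off-diagonal entry in the concentration matrix corresponds to conditional independence, here is a minimal NumPy sketch. It is not the authors' method: the 3-by-3 matrix `omega` and the helper `partial_correlations` are hypothetical, chosen only for illustration. The sketch relies on the standard identity that the partial correlation between variables i and j, given all others, equals -omega[i, j] / sqrt(omega[i, i] * omega[j, j]).

```python
import numpy as np

# Hypothetical concentration (inverse covariance) matrix, for illustration only.
# Its (0, 2) entry is zero, so variables 0 and 2 are conditionally
# independent given variable 1 under a Gaussian model.
omega = np.array([
    [2.0, 0.6, 0.0],
    [0.6, 2.0, 0.6],
    [0.0, 0.6, 2.0],
])


def partial_correlations(precision):
    """Partial correlations implied by a concentration matrix."""
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# The covariance matrix is the inverse of the concentration matrix.
sigma = np.linalg.inv(omega)
pcor = partial_correlations(omega)

# Edges of the graph are the nonzero off-diagonal entries of omega.
p = omega.shape[0]
edges = [(i, j) for i in range(p) for j in range(i + 1, p)
         if abs(omega[i, j]) > 1e-12]

print("marginal covariance of (0, 2):", sigma[0, 2])
print("partial correlation of (0, 2):", pcor[0, 2])
print("edges:", edges)
```

In this toy example variables 0 and 2 have nonzero marginal covariance but zero partial correlation, so the graph has no edge between them. That is why sparsity is sought in the concentration matrix rather than in the covariance matrix.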

Original language: English (US)
Pages (from-to): 670-685
Number of pages: 16
Journal: Biostatistics
Volume: 16
Issue number: 4
State: Published - Oct 1 2015

Keywords

  • Concentration matrix
  • Gaussian graphical models
  • Gene regulatory network
  • Lung cancer
  • Partially identified graph
  • Protein-protein interaction

ASJC Scopus subject areas

  • Medicine (all)
  • Statistics and Probability
  • Statistics, Probability and Uncertainty
