Gene function prediction by a combined analysis of gene expression data and protein-protein interaction data

Guanghua Xiao, Wei Pan

Research output: Contribution to journal › Article › peer-review

18 Scopus citations

Abstract

Prediction of the biological functions of genes is an important issue in basic biology research and has applications in drug discovery and gene therapy. Previous studies have shown that either gene expression data or protein-protein interaction data alone can be used for predicting gene functions. In particular, clustering gene expression profiles has been widely used for gene function prediction. In this paper, we first propose a new method for gene function prediction using protein-protein interaction data, which will facilitate combining its results with predictions based on clustering gene expression profiles. We then propose a new method to combine the prediction results from the two data sources by weighting the evidence provided by each. Using protein-protein interaction data downloaded from the GRID database and published gene expression profiles from 300 microarray experiments for the yeast S. cerevisiae, we show that this new combined analysis provides improved predictive performance over that of using either data source alone in a cross-validated analysis of the MIPS gene annotations. Finally, we propose a logistic regression method that is flexible enough to combine information from any number of data sources while maintaining computational feasibility.
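The abstract does not spell out the combination rule. As a rough illustration of what "weighting the evidence" from two data sources can look like, the sketch below uses a generic weighted Z-score (Stouffer) combination of p-values. This is a swapped-in textbook technique and should not be read as the authors' method. The function name `combine_p_values`, the weights, and the example p-values are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def combine_p_values(p_values, weights):
    """Weighted Stouffer combination of one-sided p-values.

    Illustrative assumption only: a generic rule, not the paper's procedure.
    Each p-value must lie strictly between 0 and 1.
    """
    if len(p_values) != len(weights):
        raise ValueError("p_values and weights must have the same length")
    # Map each p-value to a Z-score; a small p-value gives a large positive Z.
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    # Weighted sum of Z-scores, rescaled so the result is again standard normal
    # under the null hypothesis (assuming the sources are independent).
    weighted_sum = sum(w * z for w, z in zip(weights, z_scores))
    scale = sqrt(sum(w * w for w in weights))
    return 1.0 - _STD_NORMAL.cdf(weighted_sum / scale)


# Hypothetical evidence that one gene belongs to one functional class:
p_expression = 0.04   # made-up p-value from expression clustering
p_interaction = 0.20  # made-up p-value from interaction data
combined = combine_p_values([p_expression, p_interaction], weights=[2.0, 1.0])
print(f"combined p-value: {combined:.4f}")
```

Giving a larger weight to one source lets its evidence dominate the combined score. How such weights would be chosen (for example, from each source's cross-validated accuracy) is left open here.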

Original language: English (US)
Pages (from-to): 1371-1389
Number of pages: 19
Journal: Journal of Bioinformatics and Computational Biology
Volume: 3
Issue number: 6
State: Published - Dec 1 2005

Keywords

  • Cluster analysis
  • Combining p-values
  • Cross-validation
  • Logistic regression
  • Naive Bayes
  • Weighted average

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
