Integrative gene set enrichment analysis utilizing isoform-specific expression

Lie Li, Xinlei Wang, Guanghua Xiao, Adi Gazdar

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

Gene set enrichment analysis (GSEA) aims at identifying essential pathways, or more generally, sets of biologically related genes that are involved in complex human diseases. In the past, many studies have shown that GSEA is a very useful bioinformatics tool that plays critical roles in the innovation of disease prevention and intervention strategies. Despite its tremendous success, it is striking that conclusions of GSEA drawn from isolated studies are often sparse, and different studies may lead to inconsistent and sometimes contradictory results. Further, in the wake of next generation sequencing technologies, it has been made possible to measure genome-wide isoform-specific expression levels, calling for innovations that can utilize the unprecedented resolution. Currently, enormous amounts of data have been created from various RNA-seq experiments. All these give rise to a pressing need for developing integrative methods that allow for explicit utilization of isoform-specific expression, to combine multiple enrichment studies, in order to enhance the power, reproducibility, and interpretability of the analysis. We develop and evaluate integrative GSEA methods, based on two-stage procedures, which, for the first time, allow statistically efficient use of isoform-specific expression from multiple RNA-seq experiments. Through simulation and real data analysis, we show that our methods can greatly improve the performance in identifying essential gene sets compared to existing methods that can only use gene-level expression.

Original language: English (US)
Journal: Genetic Epidemiology
DOI: 10.1002/gepi.22052
State: Accepted/In press - 2017

Fingerprint

Protein Isoforms
Genes
RNA
Essential Genes
Computational Biology
Genome
Technology
Gene Expression

Keywords

  • Fixed effect
  • GLM
  • Integrative GSEA
  • Pathway analysis
  • Random effects
  • RNA-seq
  • Score statistic

ASJC Scopus subject areas

  • Epidemiology
  • Genetics (clinical)

Cite this

Integrative gene set enrichment analysis utilizing isoform-specific expression. / Li, Lie; Wang, Xinlei; Xiao, Guanghua; Gazdar, Adi.

In: Genetic Epidemiology, 2017.

@article{63100edc6ffc41da8f5fdd6d8c8095c7,
title = "Integrative gene set enrichment analysis utilizing isoform-specific expression",
abstract = "Gene set enrichment analysis (GSEA) aims at identifying essential pathways, or more generally, sets of biologically related genes that are involved in complex human diseases. In the past, many studies have shown that GSEA is a very useful bioinformatics tool that plays critical roles in the innovation of disease prevention and intervention strategies. Despite its tremendous success, it is striking that conclusions of GSEA drawn from isolated studies are often sparse, and different studies may lead to inconsistent and sometimes contradictory results. Further, in the wake of next generation sequencing technologies, it has been made possible to measure genome-wide isoform-specific expression levels, calling for innovations that can utilize the unprecedented resolution. Currently, enormous amounts of data have been created from various RNA-seq experiments. All these give rise to a pressing need for developing integrative methods that allow for explicit utilization of isoform-specific expression, to combine multiple enrichment studies, in order to enhance the power, reproducibility, and interpretability of the analysis. We develop and evaluate integrative GSEA methods, based on two-stage procedures, which, for the first time, allow statistically efficient use of isoform-specific expression from multiple RNA-seq experiments. Through simulation and real data analysis, we show that our methods can greatly improve the performance in identifying essential gene sets compared to existing methods that can only use gene-level expression.",
keywords = "Fixed effect, GLM, Integrative GSEA, Pathway analysis, Random effects, RNA-seq, Score statistic",
author = "Lie Li and Xinlei Wang and Guanghua Xiao and Adi Gazdar",
year = "2017",
doi = "10.1002/gepi.22052",
language = "English (US)",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
}

TY - JOUR

T1 - Integrative gene set enrichment analysis utilizing isoform-specific expression

AU - Li, Lie

AU - Wang, Xinlei

AU - Xiao, Guanghua

AU - Gazdar, Adi

PY - 2017

Y1 - 2017

N2 - Gene set enrichment analysis (GSEA) aims at identifying essential pathways, or more generally, sets of biologically related genes that are involved in complex human diseases. In the past, many studies have shown that GSEA is a very useful bioinformatics tool that plays critical roles in the innovation of disease prevention and intervention strategies. Despite its tremendous success, it is striking that conclusions of GSEA drawn from isolated studies are often sparse, and different studies may lead to inconsistent and sometimes contradictory results. Further, in the wake of next generation sequencing technologies, it has been made possible to measure genome-wide isoform-specific expression levels, calling for innovations that can utilize the unprecedented resolution. Currently, enormous amounts of data have been created from various RNA-seq experiments. All these give rise to a pressing need for developing integrative methods that allow for explicit utilization of isoform-specific expression, to combine multiple enrichment studies, in order to enhance the power, reproducibility, and interpretability of the analysis. We develop and evaluate integrative GSEA methods, based on two-stage procedures, which, for the first time, allow statistically efficient use of isoform-specific expression from multiple RNA-seq experiments. Through simulation and real data analysis, we show that our methods can greatly improve the performance in identifying essential gene sets compared to existing methods that can only use gene-level expression.

KW - Fixed effect

KW - GLM

KW - Integrative GSEA

KW - Pathway analysis

KW - Random effects

KW - RNA-seq

KW - Score statistic

UR - http://www.scopus.com/inward/record.url?scp=85020190314&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85020190314&partnerID=8YFLogxK

U2 - 10.1002/gepi.22052

DO - 10.1002/gepi.22052

M3 - Article

C2 - 28580727

AN - SCOPUS:85020190314

JO - Genetic Epidemiology

JF - Genetic Epidemiology

SN - 0741-0395

ER -