GIGSEA: Genotype imputed gene set enrichment analysis using GWAS summary level data

Shijia Zhu, Tongqi Qian, Yujin Hoshida, Yuan Shen, Jing Yu, Ke Hao

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Summary-level GWAS data are becoming increasingly important in post-GWAS data mining. Here, we present GIGSEA (Genotype Imputed Gene Set Enrichment Analysis), a novel method that uses GWAS summary statistics and eQTL to infer differential gene expression and interrogate gene set enrichment for the trait-associated SNPs. By incorporating empirical eQTL of the disease-relevant tissue, GIGSEA naturally accounts for factors such as gene size, gene boundary, SNP distal regulation and multiple-marker regulation. A weighted linear regression model is used to perform the enrichment test, properly adjusting for imputation accuracy, model incompleteness and redundancy in different gene sets. The significance level of enrichment is assessed by a permutation test, in which matrix operations are employed to dramatically improve computation speed. GIGSEA has appropriate type I error rates and discovers plausible biological findings on a real data set.

Availability and implementation: GIGSEA is implemented in R and freely available at www.github.com/zhushijia/GIGSEA.
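The enrichment test described in the abstract (a weighted linear regression whose significance is assessed by permutation, with the permutations evaluated as one matrix operation) can be illustrated with a minimal sketch. This is not the GIGSEA R code. The function name, its arguments, the use of a 0/1 membership indicator as the single predictor, and the choice to shuffle membership labels across genes are all assumptions made for illustration, and Python with NumPy is used here only for brevity.

```python
import numpy as np


def weighted_enrichment_permutation(expr_z, membership, weights,
                                    n_perm=1000, seed=0):
    """Hypothetical helper: weighted-regression slope of imputed
    differential-expression scores on gene-set membership, with a
    one-sided permutation p-value."""
    rng = np.random.default_rng(seed)
    y = np.asarray(expr_z, dtype=float)
    x = np.asarray(membership, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalise weights to sum to 1

    yc = y - np.sum(w * y)               # weighted-centred response

    def slopes(X):
        # X has shape (n_genes, k); each column is one membership vector.
        xc = X - w @ X                   # weighted-centre every column
        num = (w * yc) @ xc              # weighted covariance with y, shape (k,)
        den = w @ (xc ** 2)              # weighted variance of x, shape (k,)
        return num / den

    observed = slopes(x[:, None])[0]

    # Assumption: the null is built by shuffling membership labels.
    # All permuted vectors are stacked as columns so that every null
    # slope comes out of a single pair of matrix products.
    perm = np.stack([rng.permutation(x) for _ in range(n_perm)], axis=1)
    null = slopes(perm)

    # Add-one correction keeps the p-value strictly above zero.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value


if __name__ == "__main__":
    demo_rng = np.random.default_rng(42)
    member = np.zeros(300)
    member[:30] = 1                      # synthetic gene set of 30 genes
    z = demo_rng.normal(size=300) + 1.5 * member
    w = demo_rng.uniform(0.5, 1.0, size=300)
    print(weighted_enrichment_permutation(z, member, w))
```

Stacking the permuted indicators into one matrix replaces a loop of `n_perm` separate regressions with two matrix-vector products, which is the general idea behind using matrix operations to speed up a permutation test.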

Original language: English (US)
Pages (from-to): 160-163
Number of pages: 4
Journal: Bioinformatics
Volume: 35
Issue number: 1
DOI: 10.1093/bioinformatics/bty529
State: Published - Jan 1 2019

Fingerprint

Genome-Wide Association Study
Genotype
Genes
Gene
Single Nucleotide Polymorphism
Linear Models
Permutation Test
Data Mining
Type I Error Rate
Significance level
Incompleteness
Differential Expression
Imputation
Linear Regression Model
Linear regression
Gene expression
Gene Expression
Redundancy
Data mining
Availability

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

GIGSEA: Genotype imputed gene set enrichment analysis using GWAS summary level data. / Zhu, Shijia; Qian, Tongqi; Hoshida, Yujin; Shen, Yuan; Yu, Jing; Hao, Ke.

In: Bioinformatics, Vol. 35, No. 1, 01.01.2019, p. 160-163.

Zhu, Shijia; Qian, Tongqi; Hoshida, Yujin; Shen, Yuan; Yu, Jing; Hao, Ke. / GIGSEA: Genotype imputed gene set enrichment analysis using GWAS summary level data. In: Bioinformatics. 2019; Vol. 35, No. 1. pp. 160-163.

@article{9c959b257c0d47c886d1fd2d80ca7903,
title = "GIGSEA: Genotype imputed gene set enrichment analysis using GWAS summary level data",
abstract = "Summary level data of GWAS becomes increasingly important in post-GWAS data mining. Here, we present GIGSEA (Genotype Imputed Gene Set Enrichment Analysis), a novel method that uses GWAS summary statistics and eQTL to infer differential gene expression and interrogate gene set enrichment for the trait-associated SNPs. By incorporating empirical eQTL of the disease relevant tissue, GIGSEA naturally accounts for factors such as gene size, gene boundary, SNP distal regulation and multiple-marker regulation. The weighted linear regression model was used to perform the enrichment test, properly adjusting for imputation accuracy, model incompleteness and redundancy in different gene sets. The significance level of enrichment is assessed by the permutation test, where matrix operation was employed to dramatically improve computation speed. GIGSEA has appropriate type I error rates, and discovers the plausible biological findings on the real data set. Availability and implementation GIGSEA is implemented in R, and freely available at www.github.com/zhushijia/GIGSEA.",
author = "Shijia Zhu and Tongqi Qian and Yujin Hoshida and Yuan Shen and Jing Yu and Ke Hao",
year = "2019",
month = "1",
day = "1",
doi = "10.1093/bioinformatics/bty529",
language = "English (US)",
volume = "35",
pages = "160--163",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "1",

}

TY - JOUR

T1 - GIGSEA

T2 - Genotype imputed gene set enrichment analysis using GWAS summary level data

AU - Zhu, Shijia

AU - Qian, Tongqi

AU - Hoshida, Yujin

AU - Shen, Yuan

AU - Yu, Jing

AU - Hao, Ke

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Summary level data of GWAS becomes increasingly important in post-GWAS data mining. Here, we present GIGSEA (Genotype Imputed Gene Set Enrichment Analysis), a novel method that uses GWAS summary statistics and eQTL to infer differential gene expression and interrogate gene set enrichment for the trait-associated SNPs. By incorporating empirical eQTL of the disease relevant tissue, GIGSEA naturally accounts for factors such as gene size, gene boundary, SNP distal regulation and multiple-marker regulation. The weighted linear regression model was used to perform the enrichment test, properly adjusting for imputation accuracy, model incompleteness and redundancy in different gene sets. The significance level of enrichment is assessed by the permutation test, where matrix operation was employed to dramatically improve computation speed. GIGSEA has appropriate type I error rates, and discovers the plausible biological findings on the real data set. Availability and implementation GIGSEA is implemented in R, and freely available at www.github.com/zhushijia/GIGSEA.

UR - http://www.scopus.com/inward/record.url?scp=85058750129&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85058750129&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/bty529

DO - 10.1093/bioinformatics/bty529

M3 - Article

C2 - 30010968

AN - SCOPUS:85058750129

VL - 35

SP - 160

EP - 163

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 1

ER -