TY - JOUR
T1 - GIGSEA
T2 - Genotype imputed gene set enrichment analysis using GWAS summary level data
AU - Zhu, Shijia
AU - Qian, Tongqi
AU - Hoshida, Yujin
AU - Shen, Yuan
AU - Yu, Jing
AU - Hao, Ke
N1 - Funding Information:
This work was partially supported by the National Natural Science Foundation of China (Nos. 21477087 and 91643201), the Ministry of Science and Technology of China (2016YFC0206507), NIH/NIDDK (R01DK106593 and U24DK062429), and NIH/NIEHS (1R01ES029212-01).
Author Affiliations:
(1) Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; (2) Liver Tumor Translational Research Program, Simmons Comprehensive Cancer Center, Division of Digestive and Liver Diseases, Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; (3) Department of Psychiatry, (4) Department of Ophthalmology and (5) Department of Respiratory Medicine, Shanghai Tenth People’s Hospital, Tongji University, Shanghai 200092, China
Publisher Copyright:
© 2018 The Author(s). Published by Oxford University Press. All rights reserved.
PY - 2019/1/1
Y1 - 2019/1/1
N2 - Summary-level GWAS data are becoming increasingly important in post-GWAS data mining. Here, we present GIGSEA (Genotype Imputed Gene Set Enrichment Analysis), a novel method that uses GWAS summary statistics and eQTL to infer differential gene expression and interrogate gene set enrichment for the trait-associated SNPs. By incorporating empirical eQTL of the disease-relevant tissue, GIGSEA naturally accounts for factors such as gene size, gene boundary, SNP distal regulation and multiple-marker regulation. A weighted linear regression model is used to perform the enrichment test, properly adjusting for imputation accuracy, model incompleteness and redundancy in different gene sets. The significance level of enrichment is assessed by a permutation test, in which matrix operations are employed to dramatically improve computation speed. GIGSEA has appropriate type I error rates and discovers plausible biological findings on a real data set. Availability and implementation: GIGSEA is implemented in R and freely available at www.github.com/zhushijia/GIGSEA.
AB - Summary-level GWAS data are becoming increasingly important in post-GWAS data mining. Here, we present GIGSEA (Genotype Imputed Gene Set Enrichment Analysis), a novel method that uses GWAS summary statistics and eQTL to infer differential gene expression and interrogate gene set enrichment for the trait-associated SNPs. By incorporating empirical eQTL of the disease-relevant tissue, GIGSEA naturally accounts for factors such as gene size, gene boundary, SNP distal regulation and multiple-marker regulation. A weighted linear regression model is used to perform the enrichment test, properly adjusting for imputation accuracy, model incompleteness and redundancy in different gene sets. The significance level of enrichment is assessed by a permutation test, in which matrix operations are employed to dramatically improve computation speed. GIGSEA has appropriate type I error rates and discovers plausible biological findings on a real data set. Availability and implementation: GIGSEA is implemented in R and freely available at www.github.com/zhushijia/GIGSEA.
UR - http://www.scopus.com/inward/record.url?scp=85058750129&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85058750129&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bty529
DO - 10.1093/bioinformatics/bty529
M3 - Article
C2 - 30010968
AN - SCOPUS:85058750129
VL - 35
SP - 160
EP - 163
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 1
ER -