TY - JOUR
T1 - A Bayesian hierarchical model for analyzing methylated RNA immunoprecipitation sequencing data
AU - Zhang, Minzhe
AU - Li, Qiwei
AU - Xie, Yang
N1 - Funding Information:
The authors would like to thank Jessie Norris for helping with proofreading the manuscript. This work was partially supported by the National Institutes of Health (Nos. R01CA172211, P50CA70907, P30CA142543, R01GM-115473, R01GM117597, R15GM113157, and R01CA152301), and the Cancer Prevention and Research Institute of Texas (No. RP120732).
Publisher Copyright:
© 2018, Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2018/9/1
Y1 - 2018/9/1
N2 - Background: The recently emerged technology of methylated RNA immunoprecipitation sequencing (MeRIP-seq) sheds light on the study of RNA epigenetics. This new bioinformatics question calls for effective and robust peak calling algorithms to detect mRNA methylation sites from MeRIP-seq data. Methods: We propose a Bayesian hierarchical model to detect methylation sites from MeRIP-seq data. Our modeling approach includes several important characteristics. First, it models the zero-inflated and over-dispersed counts by deploying a zero-inflated negative binomial model. Second, it incorporates a hidden Markov model (HMM) to account for the spatial dependency of neighboring read enrichment. Third, our Bayesian inference allows the proposed model to borrow strength in parameter estimation, which greatly improves the model stability when dealing with MeRIP-seq data with a small number of replicates. We use Markov chain Monte Carlo (MCMC) algorithms to simultaneously infer the model parameters in a de novo fashion. The R Shiny demo is available at https://qiwei.shinyapps.io/BaySeqPeak and the R/C++ code is available at https://github.com/liqiwei2000/BaySeqPeak. Results: In simulation studies, the proposed method outperformed the competing methods exomePeak and MeTPeak, especially when an excess of zeros was present in the data. In real MeRIP-seq data analysis, the proposed method identified methylation sites that were more consistent with biological knowledge, and had better spatial resolution compared to the other methods. Conclusions: In this study, we develop a Bayesian hierarchical model to identify methylation peaks in MeRIP-seq data. The proposed method has a competitive edge over existing methods in terms of accuracy, robustness and spatial resolution.
AB - Background: The recently emerged technology of methylated RNA immunoprecipitation sequencing (MeRIP-seq) sheds light on the study of RNA epigenetics. This new bioinformatics question calls for effective and robust peak calling algorithms to detect mRNA methylation sites from MeRIP-seq data. Methods: We propose a Bayesian hierarchical model to detect methylation sites from MeRIP-seq data. Our modeling approach includes several important characteristics. First, it models the zero-inflated and over-dispersed counts by deploying a zero-inflated negative binomial model. Second, it incorporates a hidden Markov model (HMM) to account for the spatial dependency of neighboring read enrichment. Third, our Bayesian inference allows the proposed model to borrow strength in parameter estimation, which greatly improves the model stability when dealing with MeRIP-seq data with a small number of replicates. We use Markov chain Monte Carlo (MCMC) algorithms to simultaneously infer the model parameters in a de novo fashion. The R Shiny demo is available at https://qiwei.shinyapps.io/BaySeqPeak and the R/C++ code is available at https://github.com/liqiwei2000/BaySeqPeak. Results: In simulation studies, the proposed method outperformed the competing methods exomePeak and MeTPeak, especially when an excess of zeros was present in the data. In real MeRIP-seq data analysis, the proposed method identified methylation sites that were more consistent with biological knowledge, and had better spatial resolution compared to the other methods. Conclusions: In this study, we develop a Bayesian hierarchical model to identify methylation peaks in MeRIP-seq data. The proposed method has a competitive edge over existing methods in terms of accuracy, robustness and spatial resolution.
KW - Bayesian inference
KW - MeRIP-seq data
KW - RNA epigenomics
KW - hidden Markov model
KW - zero-inflated negative binomial
UR - http://www.scopus.com/inward/record.url?scp=85053228281&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85053228281&partnerID=8YFLogxK
U2 - 10.1007/s40484-018-0149-2
DO - 10.1007/s40484-018-0149-2
M3 - Article
C2 - 33833899
AN - SCOPUS:85053228281
VL - 6
SP - 275
EP - 286
JO - Quantitative Biology
JF - Quantitative Biology
SN - 2095-4689
IS - 3
ER -