A Bayesian framework for de novo mutation calling in parents-offspring trios

Qiang Wei, Xiaowei Zhan, Xue Zhong, Yongzhuang Liu, Yujun Han, Wei Chen, Bingshan Li

Research output: Contribution to journal (Article)

22 Citations (Scopus)

Abstract

Motivation: Spontaneous (de novo) mutations play an important role in the etiology of a range of complex diseases. Identifying de novo mutations (DNMs) in sporadic cases provides an effective strategy to find genes or genomic regions implicated in the genetics of disease. High-throughput next-generation sequencing enables genome- or exome-wide detection of DNMs by sequencing parents-proband trios. It is challenging to sift true mutations from the massive amount of noise due to sequencing error and alignment artifacts. One of the critical limitations of existing methods is that the same pre-specified mutation rate is assumed for all genomic regions, which has a significant impact on the DNM calling accuracy.

Results: In this study, we developed and implemented a novel Bayesian framework for DNM calling in trios (TrioDeNovo), which overcomes these limitations by disentangling prior mutation rates from evaluation of the likelihood of the data, so that flexible priors can be adjusted post hoc at different genomic sites. Through extensive simulations and application to real data, we showed that this new method has improved sensitivity and specificity over existing methods, and provides a flexible framework to further improve the efficiency by incorporating proper priors. The accuracy is further improved using effective filtering based on sequence alignment characteristics.
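The idea of separating the prior mutation rate from the data likelihood can be illustrated with a minimal sketch. It assumes the evidence at a site is summarized as a likelihood ratio (called a Bayes factor here) that does not depend on the prior, so that posterior odds are the Bayes factor times the prior odds. The function name and all numeric values are hypothetical and serve only to illustrate the general Bayesian principle; this is not the TrioDeNovo implementation.

```python
def posterior_dnm_probability(bayes_factor: float, prior_mutation_rate: float) -> float:
    """Combine a prior-free Bayes factor with a site-specific prior mutation rate.

    bayes_factor: P(data | de novo) / P(data | no de novo), computed once per site.
    prior_mutation_rate: assumed prior probability of a de novo mutation at the site.
    """
    if not 0.0 < prior_mutation_rate < 1.0:
        raise ValueError("prior_mutation_rate must lie strictly between 0 and 1")
    prior_odds = prior_mutation_rate / (1.0 - prior_mutation_rate)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical example: the same Bayes factor evaluated under three assumed priors.
example_bayes_factor = 1e9
for label, rate in [("low-rate site", 1e-9),
                    ("default site", 1e-8),
                    ("high-rate site", 1e-7)]:
    probability = posterior_dnm_probability(example_bayes_factor, rate)
    print(f"{label}: prior={rate:.0e}, posterior={probability:.3f}")
```

Because the Bayes factor is computed independently of the prior, changing the assumed mutation rate at a site only requires re-evaluating this one-line update, not recomputing the likelihood of the sequencing data.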

Original language: English (US)
Pages (from-to): 1375-1381
Number of pages: 7
Journal: Bioinformatics
Volume: 31
Issue number: 9
DOI: 10.1093/bioinformatics/btu839
State: Published - Nov 5 2014

Fingerprint

Mutation
Genes
Mutation Rate
Sequencing
Genomics
Throughput
Exome
Inborn Genetic Diseases
Sequence Alignment
Artifacts
Noise
Framework
Genome
High Throughput
Specificity
Sensitivity and Specificity
Likelihood
Alignment
Filtering
Gene

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Computational Mathematics
  • Statistics and Probability

Cite this

A Bayesian framework for de novo mutation calling in parents-offspring trios. / Wei, Qiang; Zhan, Xiaowei; Zhong, Xue; Liu, Yongzhuang; Han, Yujun; Chen, Wei; Li, Bingshan.

In: Bioinformatics, Vol. 31, No. 9, 05.11.2014, p. 1375-1381.

@article{eaa84164f0234f518398eb1ec3486d76,
title = "A Bayesian framework for de novo mutation calling in parents-offspring trios",
author = "Qiang Wei and Xiaowei Zhan and Xue Zhong and Yongzhuang Liu and Yujun Han and Wei Chen and Bingshan Li",
year = "2014",
month = "11",
day = "5",
doi = "10.1093/bioinformatics/btu839",
language = "English (US)",
volume = "31",
pages = "1375--1381",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "9",

}

TY - JOUR

T1 - A Bayesian framework for de novo mutation calling in parents-offspring trios

AU - Wei, Qiang

AU - Zhan, Xiaowei

AU - Zhong, Xue

AU - Liu, Yongzhuang

AU - Han, Yujun

AU - Chen, Wei

AU - Li, Bingshan

PY - 2014/11/5

Y1 - 2014/11/5

UR - http://www.scopus.com/inward/record.url?scp=84946177607&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84946177607&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btu839

DO - 10.1093/bioinformatics/btu839

M3 - Article

VL - 31

SP - 1375

EP - 1381

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 9

ER -