Massively parallel sequencing of the mouse exome to accurately identify rare, induced mutations: An immediate source for thousands of new mouse models

T. D. Andrews, B. Whittle, M. A. Field, B. Balakishnan, Y. Zhang, Y. Shao, V. Cho, M. Kirk, M. Singh, Y. Xia, J. Hager, S. Winslade, G. Sjollema, B. Beutler, A. Enders, C. C. Goodnow

Research output: Contribution to journal, Article

70 Citations (Scopus)

Abstract

Accurate identification of sparse heterozygous single-nucleotide variants (SNVs) is a critical challenge for identifying the causative mutations in mouse genetic screens, human genetic diseases and cancer. When seeking to identify causal DNA variants that occur at such low rates, they are overwhelmed by false-positive calls that arise from a range of technical and biological sources. We describe a strategy using whole-exome capture, massively parallel DNA sequencing and computational analysis, which identifies with a low false-positive rate the majority of heterozygous and homozygous SNVs arising de novo with a frequency of one nucleotide substitution per megabase in progeny of N-ethyl-N-nitrosourea (ENU)-mutated C57BL/6j mice. We found that by applying a strategy of filtering raw SNV calls against known and platform-specific variants we could call true SNVs with a false-positive rate of 19.4 per cent and an estimated false-negative rate of 21.3 per cent. These error rates are small enough to enable calling a causative mutation from both homozygous and heterozygous candidate mutation lists with little or no further experimental validation. The efficacy of this approach is demonstrated by identifying the causative mutation in the Ptprc gene in a lymphocyte-deficient strain and in 11 other strains with immune disorders or obesity, without the need for meiotic mapping. Exome sequencing of first-generation mutant mice revealed hundreds of unphenotyped protein-changing mutations, 52 per cent of which are predicted to be deleterious, which now become available for breeding and experimental analysis. We show that exome sequencing data alone are sufficient to identify induced mutations. This approach transforms genetic screens in mice, establishes a general strategy for analysing rare DNA variants and opens up a large new source for experimental models of human disease.
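
As a rough illustration of the filtering step described in the abstract, the Python sketch below discards raw SNV calls that match either a set of known variants or a set of recurrent platform-specific calls. It is not the authors' pipeline: the `SNV` record, the function name `filter_candidate_snvs` and the toy coordinates are all hypothetical, and a real analysis would read calls from variant files and apply further quality filters.

```python
from typing import Iterable, List, NamedTuple


class SNV(NamedTuple):
    """Hypothetical minimal record for one single-nucleotide variant call."""
    chrom: str
    pos: int   # 1-based position on the chromosome
    ref: str   # reference base
    alt: str   # called alternative base


def filter_candidate_snvs(
    raw_calls: Iterable[SNV],
    known_variants: Iterable[SNV],
    platform_artifacts: Iterable[SNV],
) -> List[SNV]:
    """Keep raw calls absent from both exclusion sets, preserving input order."""
    excluded = set(known_variants) | set(platform_artifacts)
    return [call for call in raw_calls if call not in excluded]


# Toy data invented for illustration only; not taken from the study.
raw_calls = [
    SNV("chr1", 1000, "A", "G"),
    SNV("chr1", 2500, "C", "T"),
    SNV("chr2", 300, "G", "A"),
    SNV("chr3", 42, "T", "C"),
]
known_variants = {SNV("chr1", 2500, "C", "T")}      # e.g. a catalogued strain variant
platform_artifacts = {SNV("chr3", 42, "T", "C")}    # e.g. a call recurring across unrelated samples

candidates = filter_candidate_snvs(raw_calls, known_variants, platform_artifacts)
for snv in candidates:
    print(snv.chrom, snv.pos, f"{snv.ref}>{snv.alt}")
```

Matching on the full (chromosome, position, reference, alternative) tuple is a design choice in this sketch: a call at a known position but with a different alternative base is kept as a candidate.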

Original language: English (US)
Article number: 120061
Journal: Open Biology
Volume: 2
Issue number: MAY
DOI: https://doi.org/10.1098/rsob.120061
State: Published - 2012

Fingerprint

Exome
High-Throughput Nucleotide Sequencing
Nucleotides
Mutation
DNA
Ethylnitrosourea
Lymphocytes
Inborn Genetic Diseases
Immune System Diseases
Medical Genetics
Identification (control systems)
DNA Sequence Analysis
Inbred C57BL Mouse
Substitution reactions
Genes
Breeding
Theoretical Models
Obesity
Proteins
Neoplasms

Keywords

  • DNA capture
  • Exome sequencing
  • Mouse
  • Mutation detection
  • N-ethyl-N-nitrosourea mutagenesis
  • Variation detection

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Neuroscience (all)
  • Immunology

Cite this

Massively parallel sequencing of the mouse exome to accurately identify rare, induced mutations: An immediate source for thousands of new mouse models. / Andrews, T. D.; Whittle, B.; Field, M. A.; Balakishnan, B.; Zhang, Y.; Shao, Y.; Cho, V.; Kirk, M.; Singh, M.; Xia, Y.; Hager, J.; Winslade, S.; Sjollema, G.; Beutler, B.; Enders, A.; Goodnow, C. C.

In: Open Biology, Vol. 2, No. MAY, 120061, 2012.

Andrews, TD, Whittle, B, Field, MA, Balakishnan, B, Zhang, Y, Shao, Y, Cho, V, Kirk, M, Singh, M, Xia, Y, Hager, J, Winslade, S, Sjollema, G, Beutler, B, Enders, A & Goodnow, CC 2012, 'Massively parallel sequencing of the mouse exome to accurately identify rare, induced mutations: An immediate source for thousands of new mouse models', Open Biology, vol. 2, no. MAY, 120061. https://doi.org/10.1098/rsob.120061
@article{93ec706c18a14b848ea00e85993689b0,
title = "Massively parallel sequencing of the mouse exome to accurately identify rare, induced mutations: An immediate source for thousands of new mouse models",
abstract = "Accurate identification of sparse heterozygous single-nucleotide variants (SNVs) is a critical challenge for identifying the causative mutations in mouse genetic screens, human genetic diseases and cancer. When seeking to identify causal DNA variants that occur at such low rates, they are overwhelmed by false-positive calls that arise from a range of technical and biological sources. We describe a strategy using whole-exome capture, massively parallel DNA sequencing and computational analysis, which identifies with a low false-positive rate the majority of heterozygous and homozygous SNVs arising de novo with a frequency of one nucleotide substitution per megabase in progeny of N-ethyl-N-nitrosourea (ENU)-mutated C57BL/6j mice. We found that by applying a strategy of filtering raw SNV calls against known and platform-specific variants we could call true SNVs with a false-positive rate of 19.4 per cent and an estimated false-negative rate of 21.3 per cent. These error rates are small enough to enable calling a causative mutation from both homozygous and heterozygous candidate mutation lists with little or no further experimental validation. The efficacy of this approach is demonstrated by identifying the causative mutation in the Ptprc gene in a lymphocyte-deficient strain and in 11 other strains with immune disorders or obesity, without the need for meiotic mapping. Exome sequencing of first-generation mutant mice revealed hundreds of unphenotyped protein-changing mutations, 52 per cent of which are predicted to be deleterious, which now become available for breeding and experimental analysis. We show that exome sequencing data alone are sufficient to identify induced mutations. This approach transforms genetic screens in mice, establishes a general strategy for analysing rare DNA variants and opens up a large new source for experimental models of human disease.",
keywords = "DNA capture, Exome sequencing, Mouse, Mutation detection, N-ethyl-N-nitrosourea mutagenesis, Variation detection",
author = "Andrews, {T. D.} and B. Whittle and Field, {M. A.} and B. Balakishnan and Y. Zhang and Y. Shao and V. Cho and M. Kirk and M. Singh and Y. Xia and J. Hager and S. Winslade and G. Sjollema and B. Beutler and A. Enders and Goodnow, {C. C.}",
year = "2012",
doi = "10.1098/rsob.120061",
language = "English (US)",
volume = "2",
journal = "Open Biology",
issn = "2046-2441",
publisher = "Royal Society Publishing",
number = "MAY",

}

TY - JOUR

T1 - Massively parallel sequencing of the mouse exome to accurately identify rare, induced mutations

T2 - An immediate source for thousands of new mouse models

AU - Andrews, T. D.

AU - Whittle, B.

AU - Field, M. A.

AU - Balakishnan, B.

AU - Zhang, Y.

AU - Shao, Y.

AU - Cho, V.

AU - Kirk, M.

AU - Singh, M.

AU - Xia, Y.

AU - Hager, J.

AU - Winslade, S.

AU - Sjollema, G.

AU - Beutler, B.

AU - Enders, A.

AU - Goodnow, C. C.

PY - 2012

Y1 - 2012

AB - Accurate identification of sparse heterozygous single-nucleotide variants (SNVs) is a critical challenge for identifying the causative mutations in mouse genetic screens, human genetic diseases and cancer. When seeking to identify causal DNA variants that occur at such low rates, they are overwhelmed by false-positive calls that arise from a range of technical and biological sources. We describe a strategy using whole-exome capture, massively parallel DNA sequencing and computational analysis, which identifies with a low false-positive rate the majority of heterozygous and homozygous SNVs arising de novo with a frequency of one nucleotide substitution per megabase in progeny of N-ethyl-N-nitrosourea (ENU)-mutated C57BL/6j mice. We found that by applying a strategy of filtering raw SNV calls against known and platform-specific variants we could call true SNVs with a false-positive rate of 19.4 per cent and an estimated false-negative rate of 21.3 per cent. These error rates are small enough to enable calling a causative mutation from both homozygous and heterozygous candidate mutation lists with little or no further experimental validation. The efficacy of this approach is demonstrated by identifying the causative mutation in the Ptprc gene in a lymphocyte-deficient strain and in 11 other strains with immune disorders or obesity, without the need for meiotic mapping. Exome sequencing of first-generation mutant mice revealed hundreds of unphenotyped protein-changing mutations, 52 per cent of which are predicted to be deleterious, which now become available for breeding and experimental analysis. We show that exome sequencing data alone are sufficient to identify induced mutations. This approach transforms genetic screens in mice, establishes a general strategy for analysing rare DNA variants and opens up a large new source for experimental models of human disease.

KW - DNA capture

KW - Exome sequencing

KW - Mouse

KW - Mutation detection

KW - N-ethyl-N-nitrosourea mutagenesis

KW - Variation detection

UR - http://www.scopus.com/inward/record.url?scp=84864232091&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84864232091&partnerID=8YFLogxK

U2 - 10.1098/rsob.120061

DO - 10.1098/rsob.120061

M3 - Article

C2 - 22724066

AN - SCOPUS:84864232091

VL - 2

JO - Open Biology

JF - Open Biology

SN - 2046-2441

IS - MAY

M1 - 120061

ER -