TY - JOUR
T1 - Mutation severity spectrum of rare alleles in the human genome is predictive of disease type
AU - Pei, Jimin
AU - Kinch, Lisa N.
AU - Otwinowski, Zbyszek
AU - Grishin, Nick V.
N1 - Funding Information:
The study is supported in part by the grants (to NVG) from the National Institutes of Health (GM127390) and the Welch Foundation (I-1505). The sponsors or funders did not play any role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. We thank Alex Treacher and Grishin lab members, Jing Zhang in particular, for helpful discussions.
Publisher Copyright:
© 2020 Pei et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2020/5
Y1 - 2020/5
N2 - The human genome harbors a variety of genetic variations. Single-nucleotide changes that alter amino acids in protein-coding regions are one of the major causes of human phenotypic variation and diseases. These single-amino acid variations (SAVs) are routinely found in whole genome and exome sequencing. Evaluating the functional impact of such genomic alterations is crucial for diagnosis of genetic disorders. We developed DeepSAV, a deep-learning convolutional neural network to differentiate disease-causing and benign SAVs based on a variety of protein sequence, structural and functional properties. Our method outperforms most stand-alone programs, and the version incorporating population and gene-level information (DeepSAV+PG) has similar predictive power as some of the best available. We transformed DeepSAV scores of rare SAVs in the human population into a quantity termed “mutation severity measure” for each human protein-coding gene. It reflects a gene’s tolerance to deleterious missense mutations and serves as a useful tool to study gene-disease associations. Genes implicated in cancer, autism, and viral interaction are found by this measure as intolerant to mutations, while genes associated with a number of other diseases are scored as tolerant. Among known disease-associated genes, those that are mutation-intolerant are likely to function in development and signal transduction pathways, while those that are mutation-tolerant tend to encode metabolic and mitochondrial proteins.
UR - http://www.scopus.com/inward/record.url?scp=85085664819&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85085664819&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1007775
DO - 10.1371/journal.pcbi.1007775
M3 - Article
C2 - 32413045
AN - SCOPUS:85085664819
SN - 1553-734X
VL - 16
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 5
M1 - e1007775
ER -