TY - JOUR
T1 - Analysis of archived residual newborn screening blood spots after whole genome amplification
AU - Cantarel, Brandi L.
AU - Lei, Yunping
AU - Weaver, Daniel
AU - Zhu, Huiping
AU - Farrell, Andrew
AU - Benstead-Hume, Graeme
AU - Reese, Justin
AU - Finnell, Richard H.
N1 - Funding Information:
This work was supported by the National Institutes of Health (Grant Nos. P01HD067244, NS076465, and R01ES021006) to RHF.
Publisher Copyright:
© 2015 Cantarel et al.
PY - 2015/8/13
Y1 - 2015/8/13
AB - Background: Deidentified newborn screening bloodspot samples (NBS) represent a valuable potential resource for genomic research if impediments to whole exome sequencing of NBS deoxyribonucleic acid (DNA), including the small amount of genomic DNA in NBS material, can be overcome. For instance, genomic analysis of NBS could be used to define allele frequencies of disease-associated variants in local populations, or to conduct prospective or retrospective studies relating genomic variation to disease emergence in pediatric populations over time. In this study, we compared the recovery of variant calls from exome sequences of amplified NBS genomic DNA to variant calls from exome sequencing of non-amplified NBS DNA from the same individuals. Results: Using a standard alignment-based Genome Analysis Toolkit (GATK), we find 62,000-76,000 additional variants in amplified samples. After application of a unique kmer enumeration and variant detection method (RUFUS), only 38,000-47,000 additional variants are observed in amplified gDNA. This result suggests that roughly half of the amplification-introduced variants identified using GATK may be the result of mapping errors and read misalignment. Conclusions: Our results show that it is possible to obtain informative, high-quality data from exome analysis of whole genome amplified NBS with the important caveat that different data generation and analysis methods can affect variant detection accuracy, and the concordance of variant calls in whole-genome amplified and non-amplified exomes.
UR - http://www.scopus.com/inward/record.url?scp=84939173828&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84939173828&partnerID=8YFLogxK
U2 - 10.1186/s12864-015-1747-2
DO - 10.1186/s12864-015-1747-2
M3 - Article
C2 - 26268606
AN - SCOPUS:84939173828
SN - 1471-2164
VL - 16
JO - BMC Genomics
JF - BMC Genomics
IS - 1
M1 - 602
ER -