Harmonizing Genetic Ancestry and Self-identified Race/Ethnicity in Genome-wide Association Studies

the VA Million Veteran Program

Research output: Contribution to journal › Article › peer-review

100 Scopus citations

Abstract

Large-scale multi-ethnic cohorts offer unprecedented opportunities to elucidate the genetic factors influencing complex traits related to health and disease among minority populations. At the same time, the genetic diversity in these cohorts presents new challenges for analysis and interpretation. We consider the utility of race and/or ethnicity categories in genome-wide association studies (GWASs) of multi-ethnic cohorts. We demonstrate that race/ethnicity information enhances the ability to understand population-specific genetic architecture. To address the practical issue that self-identified racial/ethnic information may be incomplete, we propose a machine learning algorithm that produces a surrogate variable, termed HARE. We use height as a model trait to demonstrate the utility of HARE and ethnicity-specific GWASs.

Original language: English (US)
Pages (from-to): 763-772
Number of pages: 10
Journal: American Journal of Human Genetics
Volume: 105
Issue number: 4
State: Published - Oct 3 2019

Keywords

  • biobank
  • ethnicity-specific trait loci
  • genetic ancestry
  • multi-ethnic cohort
  • self-reported race/ethnicity
  • stratified analysis
  • trans-ethnic GWAS

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)
