TOPMed Researchers Offer Guidance for Use of Race, Ethnicity, Ancestry Data in Genomic Research

NEW YORK — Researchers from the Trans-Omics for Precision Medicine (TOPMed) program have developed recommendations on how to use race, ethnicity, and ancestry data in genomic research.

While race has been recognized as a socio-political construct, its correlation with genetic ancestry has led to the two being conflated, including in genomic research settings, where this may have implications for clinical care and the public conversation. Further, a lack of consistent practices could lead to health disparities being attributed to genetic causes instead of social or structural ones, researchers affiliated with the US National Heart, Lung, and Blood Institute's TOPMed program added.

In a commentary appearing in Cell Genomics on Tuesday, the researchers presented a set of recommendations for using race, ethnicity, and ancestry data transparently and responsibly when describing and analyzing genome-wide data. While the recommendations were developed for the TOPMed program, which includes more than 80 studies in the US and internationally, the authors said they could be applicable to genetic research in diverse populations more widely.

"Our prior experiences working with human genomics consortia and discussion of relevant literature and media led us to establish recommendations for TOPMed researchers that address the challenges of working with diverse data and incorporate anti-racist principles into the research process," senior author Sarah Nelson from the University of Washington and her colleagues wrote in their commentary.

Investigators should be clear regarding the terminology they use in their studies and note how the labels they use were ascertained, the authors advised. For instance, they use race and ethnicity to refer to non-biological social categories, and genetic ancestry to describe genetic origins. At the same time, investigators should note if the source of that information is non-genetic, reported information — as well as who reported that information, participants themselves or others — or if that information is genetically inferred.

Also, the data reported should be as granular as possible. If participants could choose from descriptions such as Chinese American or Pakistani, for instance, those categories should be retained as much as possible rather than consolidated into a single Asian group.

Investigators should further clearly explain why race, ethnicity, or ancestry were used as variables in their analyses, the authors said. They should additionally consider whether any effect they observe could instead be due to correlation with non-genetic social factors and examine those effects directly, if possible.

Genome-wide association studies commonly adjust for genetic ancestry in order to reduce false positives, the TOPMed team noted. They suggested that principal component analysis or admixture analysis might be more appropriate approaches to gauge genetic ancestry than using demographic variables. The researchers also advised against using race or ethnicity as a proxy for genetic ancestry or vice versa, noting that individuals who identify as the same race or ethnicity could have a range of genetic ancestries, and individuals with similar genetic ancestries could identify as different races or ethnicities.
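
As a rough illustration of the principal component approach mentioned above, the sketch below computes ancestry principal components from a small genotype matrix. It is not taken from the commentary: the function name `genotype_pcs`, the simulated allele frequencies, and the sample sizes are all hypothetical, and a real analysis would typically also filter and prune variants first.

```python
import numpy as np

def genotype_pcs(genotypes, n_components=2):
    """Top principal components of a samples x variants matrix of 0/1/2 allele counts.

    Hypothetical helper for illustration only.
    """
    g = np.asarray(genotypes, dtype=float)
    # Center each variant (column) at its mean allele count.
    centered = g - g.mean(axis=0)
    # Drop variants with no variation across samples.
    keep = centered.std(axis=0) > 0
    centered = centered[:, keep]
    # SVD of the centered matrix; u * s gives each sample's PC coordinates.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Synthetic data (an assumption, not real study data): two sets of 20 samples
# drawn from somewhat different allele frequencies at 200 variants.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=200)
freq_b = np.clip(freq_a + rng.normal(0, 0.2, size=200), 0.05, 0.95)
group_a = rng.binomial(2, freq_a, size=(20, 200))
group_b = rng.binomial(2, freq_b, size=(20, 200))
genotypes = np.vstack([group_a, group_b])

pcs = genotype_pcs(genotypes, n_components=2)
print(pcs.shape)
```

In an association analysis, columns of `pcs` would be included as covariates in place of self-reported race or ethnicity labels, so the adjustment reflects genetic similarity inferred from the data itself.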

When reporting their results, investigators should be cognizant of the broader social contexts of their findings, especially when discussing healthcare disparities. They should also avoid reinforcing the notion that race and ethnicity are genetic concepts, and avoid generalizing findings from a smaller population to a broader one.

"We recognize that addressing race, ethnicity, and ancestry in genetics research is a nuanced practice with changing perspectives," Nelson and colleagues wrote. "There is much to learn on how best to appropriately consider social factors in genetics research and translation and ensure that we dismantle any remnants of racialized thinking from this work."
