Researchers Use Multiomic Data to Generate Genetic Scores to Predict Disease, Traits

Mar 30, 2023 | Aayushi Pratap

NEW YORK – Researchers from the University of Cambridge and elsewhere have developed genetic scores to predict complex human traits from multiomic data and validated these scores across cohorts of individuals of European, Asian, and African American ancestries.

Multiomic tools capture a range of data, including transcriptomic, proteomic, and metabolomic data, and are key to understanding the etiology of diseases. However, such analyses are expensive and time-consuming.

"Many low-resource settings in low-income countries don't have any multiomics data," said co-corresponding author Michael Inouye, director of research at the department of public health and primary care at Cambridge. "Our findings are important as they democratize multiomics data and make it possible for everyone to benefit," he added.

For their study, published in Nature on Wednesday, Inouye and colleagues used data from the INTERVAL study, which collected serum or plasma samples from participants and assayed them on five omics platforms to generate proteomic, metabolomic, and transcriptomic data: SomaScan, Olink Target, Metabolon HD4, Nightingale, and whole-blood RNA sequencing with the Illumina NovaSeq 6000. These participants were also genotyped, and, after quality control, 10,572,788 genetic variants were available. Using machine learning, the researchers developed genetic scores for 17,227 biomolecular traits, 10,521 of which reached Bonferroni-adjusted significance.
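As a rough illustration of the two ideas in that paragraph, a genetic score is typically a weighted sum of a person's allele dosages, and a Bonferroni adjustment divides the significance level by the number of tests. The sketch below is not the authors' pipeline: the variant IDs, weights, dosages, the 0.05 significance level, and the choice of one test per trait are all assumptions made for the example.

```python
# Illustrative sketch only; not the method used in the study.
# Variant IDs, weights, and dosages are hypothetical.

def genetic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies of the effect allele).

    Variants missing from `dosages` are treated as dosage 0.
    """
    return sum(weights[v] * dosages.get(v, 0.0) for v in weights)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni adjustment."""
    return alpha / n_tests


# Hypothetical per-variant weights for one biomolecular trait.
weights = {"rs_a": 0.12, "rs_b": -0.05, "rs_c": 0.30}
# Hypothetical genotype of one individual.
person = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

score = genetic_score(person, weights)

# Assumption: one significance test per trait, 17,227 traits in total.
threshold = bonferroni_threshold(0.05, 17227)

print(f"score={score:.2f}, per-test threshold={threshold:.2e}")
```

With so many traits tested, the per-test threshold becomes very small, which is why only a subset of the scores clears it.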

Next, the researchers validated these genetic scores in various cohorts of people of East Asian, South Asian, African American, and European ancestries.

"Overall, we found that genetic scores developed in INTERVAL could predict the levels of Nightingale and SomaScan traits in individuals of Asian or African American ancestry, but, as expected, the performances of these scores were significantly reduced relative to European-ancestry cohorts," the authors wrote in their paper.

The researchers used their approach to generate a synthetic multiomic dataset for the UK Biobank, which was then used in a phenome-wide association study (PheWAS) using PheCodes.

They identified 18,404 associations between genetic scores for the various traits and 18 categories of PheCodes. Circulatory, endocrine, metabolic, and digestive diseases yielded the largest number of associations across platforms, according to the researchers.
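The logic of a PheWAS can be sketched in a few lines: every genetic score is tested against every phenotype code, and only associations that survive multiple-testing correction are kept. The example below is a deliberately simplified sketch and not the authors' analysis. It assumes a two-sided z-test on the difference in mean score between cases and controls, a Bonferroni correction, and toy data with hypothetical trait and code names; real analyses would typically use regression models with covariates.

```python
# Simplified PheWAS sketch; test choice, names, and data are assumptions.
import math
from statistics import NormalDist, mean, variance


def association_p_value(scores, is_case):
    """Two-sided z-test for a difference in mean genetic score, cases vs. controls."""
    cases = [s for s, c in zip(scores, is_case) if c]
    controls = [s for s, c in zip(scores, is_case) if not c]
    se = math.sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    z = (mean(cases) - mean(controls)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def phewas(score_table, phenotype_table, alpha=0.05):
    """Test every (trait score, phenotype code) pair; keep Bonferroni-significant hits."""
    n_tests = len(score_table) * len(phenotype_table)
    hits = []
    for trait, scores in score_table.items():
        for code, is_case in phenotype_table.items():
            p = association_p_value(scores, is_case)
            if p < alpha / n_tests:
                hits.append((trait, code, p))
    return hits


# Toy data for eight individuals (hypothetical values).
score_table = {
    "trait_x": [1.9, 2.1, 2.0, 2.2, 0.1, -0.1, 0.0, 0.2],
    "trait_y": [0.5, -0.5, 0.4, -0.4, 0.5, -0.5, 0.4, -0.4],
}
phenotype_table = {
    "code_1": [True, True, True, True, False, False, False, False],
    "code_2": [True, False, True, False, True, False, True, False],
}

hits = phewas(score_table, phenotype_table)
for trait, code, p in hits:
    print(trait, code, p)
```

In this toy setup, each score separates cases from controls for one code only, so two of the four tested pairs are reported.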

The PheWAS also detected many known blood biomarkers of disease as well as other notable associations. For example, total cholesterol was significantly associated with myocardial infarction, and genetically predicted levels of IL-6R in both the Olink and SomaScan datasets were significantly associated with myocardial infarction, the researchers found.

The researchers noted that even genetic scores of apparently low predictive value may be powerful enough to detect true associations at the sample sizes of current and forthcoming biobanks.

Highlighting the limitations of the study, however, Inouye said that the training sets for the machine learning model need to include data from individuals of diverse demographics and ancestries. "Only this will lead to more equitable analysis and findings," he added.

The researchers have compiled their findings in an open resource portal called OmicsPred. "Although OmicsPred provides a key first step towards a better understanding of the distributions of clinically or therapeutically important biomarkers under high genetic control, more research is needed to understand to what extent genetic scores for multiomic traits may one day be of clinical use," the authors wrote.

Filed under

Genetic Research

Gene Expression & RNA Sequencing

Proteomics & Protein Research

Journal Study

multiomics

RNA-seq

proteomics

metabolomics

University of Cambridge

Europe
