2026 Sequencing Strategies for Population and Cancer Epidemiology Studies (SeqSPACE) : Submission #2
Submission information
Submission Number: 2
Submission ID: 179955
Submission UUID: d6b0ef6f-1cd3-4413-9058-74da1e4fa1d0
Created: Tue, 05/19/2026 - 12:27
Completed: Tue, 05/19/2026 - 12:27
Changed: Tue, 05/19/2026 - 12:27
Submitted by: Anonymous
Language: English
Is draft: No
Webform: seqspace (Abstracts)
| First Name | Haoyu |
|---|---|
| Middle Initial | |
| Last Name | Zhang |
| Degree(s) | Ph.D. |
| Position/Title/Career Status | Earl Stadtman Tenure-Track Investigator |
| Organization | National Cancer Institute |
| Email | haoyu.zhang2@nih.gov |
| Abstract Title | Integrating Common and Rare Variants to Improve Genetic Risk Prediction Across Diverse Populations |
| Abstract Summary | Background: Polygenic risk scores (PRSs) are increasingly used to stratify risk in population and cancer epidemiology, but most rely on common variants and may miss rare variants, detectable by sequencing, that have large effects in a subset of individuals. We developed RICE (polygenic Risk predictions Integrating Common and rarE variants), a framework that combines common- and rare-variant information for more inclusive genetic risk prediction. Methods: RICE builds a common-variant PRS by ensembling leading PRS methods and builds a rare-variant PRS by testing functionally annotated gene-level variant sets, collapsing significant sets into burden scores, and combining them with penalized regression. We evaluated RICE in simulations and in UK Biobank and All of Us sequencing data, including up to 740 million variants from 361,939 unrelated participants across African, Admixed American/Latino, European, Middle Eastern, and South Asian ancestries. Analyses covered 11 traits, including lipid levels, height, body mass index, breast cancer, coronary artery disease, and type 2 diabetes. Results: In simulations, RICE detected rare-variant signals and improved prediction across ancestries. In real data, the common-variant component consistently matched or outperformed leading PRS methods. Adding rare variants yielded the clearest gains for traits with established rare-variant architecture, especially lipid traits and height. For lipid traits, incorporating rare variants increased explained variance by up to 11.2% in European-ancestry and 60.7% in African-ancestry participants compared with common-variant PRS alone. Rare-variant scores also identified individuals with extreme lipid profiles who would be missed by common-variant PRS alone, and genome-wide rare-variant modeling outperformed scores restricted to established high-penetrance lipid genes. Conclusions: Rare variants identified by sequencing can add meaningful, ancestry-relevant information to PRS models, but benefits depend on trait architecture and sample size. RICE provides an open-source framework for integrating common and rare variation in large epidemiologic sequencing studies. |
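
The rare-variant steps described in the Methods (test gene-level sets, collapse significant sets into burden scores, combine with penalized regression) can be illustrated with a minimal sketch on simulated data. This is not the RICE implementation. Every function name, parameter, and threshold below is hypothetical. A simple marginal z-score screen stands in for the set-based association tests, and closed-form ridge regression stands in for the penalized regression, whose exact form the abstract does not specify. The common-variant PRS is simulated as a single precomputed score rather than an ensemble.

```python
import numpy as np


def burden_scores(genotypes, gene_sets):
    """Collapse rare variants into one burden score per gene set
    (here: the count of rare alleles carried within the set)."""
    return np.column_stack([genotypes[:, idx].sum(axis=1) for idx in gene_sets])


def select_sets(burden, y, z_threshold=3.0):
    """Hypothetical screen: keep sets whose burden score has a large
    marginal correlation z-statistic with the phenotype."""
    yc = y - y.mean()
    bc = burden - burden.mean(axis=0)
    sd = bc.std(axis=0)
    sd[sd == 0] = np.inf  # zero-variance sets get a correlation of 0
    r = (bc * yc[:, None]).mean(axis=0) / (sd * yc.std())
    z = r * np.sqrt(len(y))
    return np.flatnonzero(np.abs(z) > z_threshold)


def ridge_fit(X, y, lam):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return y_mean - x_mean @ beta, beta


def r_squared(y, y_hat):
    """Proportion of phenotypic variance explained by the prediction."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)


def run_demo(n=2000, n_variants=60, set_size=10, seed=0):
    rng = np.random.default_rng(seed)
    # Simulated inputs (assumptions): a precomputed common-variant PRS and
    # rare genotypes with a 0.5% allele frequency, grouped into gene sets.
    common_prs = rng.normal(size=n)
    genotypes = rng.binomial(2, 0.005, size=(n, n_variants))
    gene_sets = [np.arange(i, i + set_size) for i in range(0, n_variants, set_size)]
    burden = burden_scores(genotypes, gene_sets)
    # Only the first gene set has a true effect on the simulated trait.
    y = 0.5 * common_prs + 1.5 * burden[:, 0] + rng.normal(size=n)

    train, test = np.arange(n // 2), np.arange(n // 2, n)
    selected = select_sets(burden[train], y[train])  # screen on training data only

    X_common = common_prs[:, None]
    X_combined = np.column_stack([common_prs, burden[:, selected]])
    results = {"selected_sets": selected.tolist()}
    for name, X in (("r2_common", X_common), ("r2_combined", X_combined)):
        intercept, beta = ridge_fit(X[train], y[train], lam=1.0)
        results[name] = r_squared(y[test], intercept + X[test] @ beta)
    return results


if __name__ == "__main__":
    print(run_demo())
```

Screening and fitting use only the training half, so the held-out comparison between the common-only and combined models is not inflated by selecting gene sets on the evaluation samples.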