2026 Sequencing Strategies for Population and Cancer Epidemiology Studies (SeqSPACE)
4 submissions
| # | Starred | Locked | Notes | Created | User | IP address | First Name | Middle Initial | Last Name | Degree(s) | Position/Title/Career Status | Organization | Email | Abstract Title | Abstract Summary | Operations |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | | | | Fri, 05/22/2026 - 13:47 | Anonymous | 10.208.28.95 | Ryan | L | Collins | PhD | Instructor | Dana-Farber Cancer Institute | ryan_collins@dfci.harvard.edu | Diverse mediators of cancer predisposition uncovered by germline whole genome sequencing of unexplained familial cancers | Cancer frequently clusters in families due to shared environment and genetics. However, many familial cancer cases lack a clinically recognized pathogenic germline variant (PGV). We analyzed germline genomes and family history from 2,726 individuals without a PGV in the All of Us Research Program, including 1,496 cases across 18 cancer types with extensive family history and 1,230 family history-negative, cancer-free controls. We identified allelic series of rare structural variants inactivating MSH2 in individuals with phenotypes consistent with Lynch syndrome and BRCA1 in breast cancer. Cancer polygenic risk scores were enriched in cases and correlated with patterns of cancer diagnoses within families. Exome-wide rare variant analyses nominated six candidate predisposition genes, including TSTD2 and BRAT1 in thyroid and breast cancer, respectively. Overall, polygenic risk and rare variants impacting known genes explained a median of 5% of unexplained familial cancers, increasing to 11% when including newly nominated risk factors. | |
| 3 | | | | Thu, 05/21/2026 - 10:15 | Anonymous | 10.208.28.250 | Tony | | Chen | PhD | Postdoctoral Research Fellow | Massachusetts General Hospital | chentony@broadinstitute.org | SPLENDID incorporates continuous genetic ancestry in biobank-scale data to improve polygenic risk prediction across diverse populations | Polygenic risk scores are widely used in disease risk stratification, but their accuracy varies across different ancestries. Recent methods leverage multi-ancestry data to improve accuracy in under-represented populations but require labelling individuals by ancestry. This poses practical challenges, as clinical decisions are typically not based on ancestry, and many individuals may not fit into a pre-specified ancestry group. We propose SPLENDID, a penalized regression framework for large-scale individual-level data that models genetic ancestry as a continuum to produce a unified prediction model without any ancestry labels. In extensive simulations and analysis in the All of Us Research Program (N=224,364) and UK Biobank (N=340,140), we show that SPLENDID significantly improved prediction accuracy over existing methods, particularly in non-European and admixed ancestries. By modeling genetic interactions with continuous ancestry, we further identified ancestry-differential effects in lipid and blood cell phenotypes that may explain limited transferability of existing PRS methods across ancestry groups. Finally, using a logistic regression extension of SPLENDID improved prediction of breast and prostate cancer by 6% and 9%, respectively, compared to current state-of-the-art PRS. Altogether, SPLENDID stands as a valuable tool for robust risk prediction across diverse populations, reduced health disparities in genetic research, and fairer clinical implementation. | |
| 2 | | | | Tue, 05/19/2026 - 12:27 | Anonymous | 10.208.24.240 | Haoyu | | Zhang | Ph.D. | Earl Stadtman Tenure-Track Investigator | National Cancer Institute | haoyu.zhang2@nih.gov | Integrating Common and Rare Variants to Improve Genetic Risk Prediction Across Diverse Populations | Background: Polygenic risk scores (PRSs) are increasingly used to stratify risk in population and cancer epidemiology, but most rely on common variants and may miss rare sequencing variants that have large effects in a subset of individuals. We developed RICE (polygenic Risk predictions Integrating Common and rarE variants), a framework that combines common- and rare-variant information for more inclusive genetic risk prediction. Methods: RICE builds a common-variant PRS by ensembling leading PRS methods and builds a rare-variant PRS by testing functionally annotated gene-level variant sets, collapsing significant sets into burden scores, and combining them with penalized regression. We evaluated RICE in simulations and in UK Biobank and All of Us sequencing data, including up to 740 million variants from 361,939 unrelated participants across African, Admixed American/Latino, European, Middle Eastern, and South Asian ancestries. Analyses covered 11 traits, including lipid levels, height, body mass index, breast cancer, coronary artery disease, and type 2 diabetes. Results: In simulations, RICE detected rare-variant signals and improved prediction across ancestries. In real data, the common-variant component consistently matched or outperformed leading PRS methods. Adding rare variants yielded the clearest gains for traits with established rare-variant architecture, especially lipid traits and height. For lipid traits, incorporating rare variants increased explained variance by up to 11.2% in Europeans and 60.7% in African ancestry compared with common-variant PRS alone. Rare-variant scores also identified individuals with extreme lipid profiles who would be missed by common-variant PRS alone, and genome-wide rare-variant modeling outperformed scores restricted to established high-penetrance lipid genes. Conclusions: Sequencing-informed rare variants can add meaningful, ancestry-relevant information to PRS models, but benefits depend on trait architecture and sample size. RICE provides an open-source framework for integrating common and rare variation in large epidemiologic sequencing studies. | |
| 1 | | | | Mon, 03/23/2026 - 13:51 | Anonymous | 10.208.28.96 | Gustavo | A | Mendoza Fandino | Ph.D. | Postdoctoral Fellow | Monteiros'Lab/Moffitt Cancer Center | gustavo.menndoza-fandino@moffitt.org | Elucidating the Molecular Basis of Testicular Cancer Susceptibility Using Integrated GWAS, TWAS, and Functional Genomic Annotation | Testicular germ cell tumor (TGCT) is the most common cancer in young adult individuals and exhibits one of the highest heritability estimates among solid tumors. Familial aggregation studies consistently indicate a strong genetic component to TGCT susceptibility, with risk pathways enriched for cell‑cycle regulation, chromosome segregation, and DNA repair mechanisms. Although genome‑wide association studies (GWAS) and transcriptome‑wide association studies (TWAS) have identified numerous germline risk loci, the underlying biological mechanisms and causal variants remain poorly defined. Methods: We implemented an integrative analytic framework to functionally annotate TGCT risk regions identified through GWAS and TWAS. For each region, we defined a set of credible variants using LD ≥0.8 with the lead SNP. These variants were evaluated using histone‑mark profiles, chromatin accessibility, regulatory element predictions, and long‑range chromatin interaction datasets (Hi‑C) to assess potential enhancer or promoter activity. This enhancer‑focused screen was applied uniformly across all loci; however, each region was additionally examined for alternative mechanisms, including post‑transcriptional regulation. Results: The chromosome 7 locus illustrates how this approach resolves locus‑specific molecular mechanisms. TWAS and GWAS jointly prioritized SP4 as a candidate gene. Within the credible set, we identified rs7798894, located in the SP4 3′UTR, as the most plausible functional variant. rs7798894 alters a predicted binding site for hsa‑miR‑4282, a microRNA reported to exhibit tumor‑suppressive activity. The T allele (frequency ~0.72) creates a functional miRNA seed site, whereas the A allele (frequency ~0.28) disrupts it, suggesting allele‑specific SP4 regulation. Although enhancer annotations were surveyed, the miRNA‑mediated mechanism provided the strongest functional explanation for this locus. Conclusions: Our integrative GWAS–TWAS framework enables locus‑specific mechanistic inference and highlights post‑transcriptional miRNA targeting as a driver of risk at the chromosome 7 SP4 locus. This approach improves causal variant identification, informs biologically grounded polygenic risk score development, and advances mechanistic understanding of TGCT susceptibility. | |