NCI Office of Data Sharing (ODS) Data Jamboree (Abstract Submissions): Submission #41
Submission information
Submission Number: 41
Submission ID: 145273
Submission UUID: e3c93020-4ccd-4e9c-87e2-7106df727322
Submission URI: /nci/ods-data-jamboree/abstractsubmissions
Created: Tue, 06/24/2025 - 11:46
Completed: Tue, 06/24/2025 - 11:47
Changed: Wed, 06/25/2025 - 10:36
Remote IP address: 10.208.28.5
Submitted by: Anonymous
Language: English
Is draft: No
Presenter Information
Minghong
{Empty}
Ward
M.S., Electrical and Computer Engineering
Product Owner for dbGaP FHIR Product
NLM/NCBI
BETHESDA
{Empty}
Abstract Information
Methods to enable data interoperability
dbGaP, FHIR, DRS, Cloud-computing, Interoperability
Making dbGaP data interoperable and analysis-ready with the FHIR and DRS APIs
The NIH’s database of Genotypes and Phenotypes (dbGaP) includes data from 3,000+ studies across 800 diseases or focus areas, involving 4 million participants and 450,000+ phenotype variables. dbGaP has supported over 8,000 publications. As researchers increasingly rely on cloud platforms to conduct cross-study analysis, both interoperability and ease of access are urgently needed. We developed a FHIR (Fast Healthcare Interoperability Resources) API for dbGaP that delivers both open-access and controlled-access dbGaP data. The open-access API provides programmatic access to study metadata, enabling researchers to discover relevant datasets. The controlled-access API delivers over 1.1 billion phenotypic observations, along with molecular sequence files served through persistent URLs using the GA4GH Data Repository Service (DRS), another global standard. We will show how a simple Python script in a Jupyter notebook can perform phenotype-driven statistical analysis across multiple datasets and repositories using the FHIR API. This approach enhances data reuse, facilitates cohort building, and helps accelerate reproducible research at scale.
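To make the discovery step concrete, here is a minimal Python sketch of how a notebook might search study metadata through a FHIR REST endpoint. It is not the script from the presentation. The base URL is an assumption and should be checked against current dbGaP documentation, the helper names (`build_search_url`, `fetch_bundle`, `study_titles`, `next_page_url`) are hypothetical, and `SAMPLE_BUNDLE` is hand-written illustrative data rather than real dbGaP output. The Bundle layout (`entry[].resource`, `link[].relation`) and the `_count` search parameter come from the FHIR R4 specification.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed open-access base URL; verify against current dbGaP documentation.
BASE_URL = "https://dbgap-api.ncbi.nlm.nih.gov/fhir/x1"


def build_search_url(base_url, resource_type, params):
    """Build a FHIR search URL such as <base>/ResearchStudy?_count=5."""
    return f"{base_url.rstrip('/')}/{resource_type}?{urlencode(params)}"


def fetch_bundle(url):
    """GET a FHIR search result (a Bundle) as a Python dict. Needs network access."""
    request = Request(url, headers={"Accept": "application/fhir+json"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def study_titles(bundle):
    """Return (id, title) pairs for every ResearchStudy in a Bundle."""
    pairs = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") == "ResearchStudy":
            pairs.append((resource.get("id"), resource.get("title")))
    return pairs


def next_page_url(bundle):
    """Return the URL of the next result page, or None on the last page."""
    for link in bundle.get("link", []):
        if link.get("relation") == "next":
            return link.get("url")
    return None


# Hand-written illustrative Bundle (NOT real dbGaP data), used to show parsing offline.
SAMPLE_BUNDLE = {
    "resourceType": "Bundle",
    "type": "searchset",
    "link": [
        {"relation": "self", "url": "https://example.org/fhir/ResearchStudy?_count=2"},
        {"relation": "next", "url": "https://example.org/fhir/ResearchStudy?_count=2&page=2"},
    ],
    "entry": [
        {"resource": {"resourceType": "ResearchStudy", "id": "example-1", "title": "Example study A"}},
        {"resource": {"resourceType": "ResearchStudy", "id": "example-2", "title": "Example study B"}},
    ],
}

if __name__ == "__main__":
    print(build_search_url(BASE_URL, "ResearchStudy", {"_count": 5}))
    for study_id, title in study_titles(SAMPLE_BUNDLE):
        print(study_id, title)
    print(next_page_url(SAMPLE_BUNDLE))
```

In a notebook, one would pass the URL from `build_search_url` to `fetch_bundle` and then follow `next_page_url` until it returns `None`. The controlled-access API would additionally require authorization, which this sketch does not cover.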