NCI Office of Data Sharing (ODS) Data Jamboree (Abstract Submissions): Submission #41
Submission information
Submission Number: 41
Submission ID: 145273
Submission UUID: e3c93020-4ccd-4e9c-87e2-7106df727322
Submission URI: /nci/ods-data-jamboree/abstractsubmissions
Created: Tue, 06/24/2025 - 11:46
Completed: Tue, 06/24/2025 - 11:47
Changed: Wed, 06/25/2025 - 10:36
Remote IP address: 10.208.28.5
Submitted by: Anonymous
Language: English
Is draft: No
Presenter Information
---------------------
First Name: Minghong
Middle Initial: {Empty}
Last Name: Ward
Degree(s): MS. Electrical and Computer Engineering
Position/Title/Career Status: Product Owner for dbGaP FHIR Product
Organization: NLM/NCBI
Organization Address: BETHESDA
Email: wardming@nih.gov
Other (Please Specify): {Empty}

Abstract Information
--------------------
Abstract Category: Methods to enable data interoperability
Abstract Keywords: dbGaP, FHIR, DRS, Cloud-computing, Interoperability
Abstract Title: Making dbGaP data interoperable and analysis-ready with FHIR and DRS API
Abstract Summary: The NIH’s Database of Genotypes and Phenotypes (dbGaP) includes data from 3,000+ studies across 800 diseases/focuses, involving 4 million participants and 450,000+ phenotype variables. dbGaP has supported over 8,000 publications. As researchers increasingly rely on cloud platforms to conduct cross-study analysis, both interoperability and ease of access are urgently needed. We developed a FHIR (Fast Healthcare Interoperability Resources) API for dbGaP that delivers both open-access and controlled-access dbGaP data. The open-access API provides programmatic access to study metadata, enabling researchers to discover relevant datasets. The controlled-access API delivers over 1.1 billion phenotypic observations, along with molecular sequence files that are served through persistent URLs using the GA4GH Data Repository Service (DRS), another global standard. We will show how a simple Python script in a Jupyter notebook can perform phenotype-driven statistical analysis across multiple datasets and repositories using the FHIR API. This approach enhances data reuse, facilitates cohort building, and helps accelerate reproducible research at scale.
Upload Abstract: https://events.cancer.gov/sites/default/files/webform/nci_office_of_data_sharing_abstr/145273/KidsFirst_ODS_poster_June2025.docx
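
The kind of notebook script described in the summary might look roughly like the sketch below. It is an illustration under stated assumptions, not the presenter's actual code: `FHIR_BASE` is a placeholder rather than the real dbGaP endpoint, and all function names are hypothetical. It relies only on generic behavior from the standards themselves: FHIR search results arrive as a Bundle with `entry` and `link` arrays, numeric Observations carry a `valueQuantity`, and a hostname-based DRS URI resolves to `/ga4gh/drs/v1/objects/<id>` on that host. Authentication for controlled-access data is omitted.

```python
import json
import statistics
import urllib.parse
import urllib.request

# Assumption: placeholder base URL; substitute the actual FHIR endpoint.
FHIR_BASE = "https://example.org/fhir"


def build_search_url(base, resource_type, **params):
    """Build a FHIR search URL such as <base>/ResearchStudy?_count=10."""
    query = urllib.parse.urlencode(sorted(params.items()))
    url = f"{base.rstrip('/')}/{resource_type}"
    return f"{url}?{query}" if query else url


def bundle_resources(bundle):
    """Return the resources contained in a FHIR Bundle's entries."""
    return [e["resource"] for e in bundle.get("entry", []) if "resource" in e]


def next_page_url(bundle):
    """Return the URL of the Bundle's 'next' link, or None on the last page."""
    for link in bundle.get("link", []):
        if link.get("relation") == "next":
            return link.get("url")
    return None


def fetch_all(url, max_pages=5):
    """Follow 'next' links and collect resources (performs network calls)."""
    resources = []
    while url and max_pages > 0:
        req = urllib.request.Request(
            url, headers={"Accept": "application/fhir+json"}
        )
        with urllib.request.urlopen(req) as resp:
            bundle = json.load(resp)
        resources.extend(bundle_resources(bundle))
        url = next_page_url(bundle)
        max_pages -= 1
    return resources


def quantity_values(observations):
    """Extract numeric values from Observations that have a valueQuantity."""
    return [
        float(o["valueQuantity"]["value"])
        for o in observations
        if "value" in o.get("valueQuantity", {})
    ]


def drs_to_https(drs_uri):
    """Map a hostname-based DRS URI to its HTTPS object endpoint."""
    parsed = urllib.parse.urlparse(drs_uri)
    if parsed.scheme != "drs" or not parsed.netloc:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    object_id = parsed.path.lstrip("/")
    return f"https://{parsed.netloc}/ga4gh/drs/v1/objects/{object_id}"
```

In a notebook, `fetch_all(build_search_url(FHIR_BASE, "Observation", _count=100))` would gather one study's observations, and `statistics.mean(quantity_values(...))` would summarize a numeric phenotype. Repeating the call against a second repository's base URL would give the cross-repository comparison the summary describes.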