NCI's Office of Data Sharing Annual Symposium (Abstract Submission): Submission #3

Submission information
Submission Number: 3
Submission ID: 144018
Submission UUID: 2420c124-9aa2-44be-9c03-e476cf370257

Created: Wed, 06/04/2025 - 16:58
Completed: Wed, 06/04/2025 - 16:58
Changed: Wed, 06/04/2025 - 16:58

Remote IP address: 10.208.28.130
Submitted by: Anonymous
Language: English

Is draft: No





Presenter Information
---------------------






First Name: Jeff









Middle Initial: {Empty}









Last Name: Liu









Degree(s): MSc.









Position/Title/Career Status: Director, Data Management and Strategy









Organization: Dana-Farber Cancer Institute









Organization Address:
Boston










Email: J_Liu@dfci.harvard.edu













Abstract Information
--------------------






Abstract Category: Poster Abstract









Abstract Keywords: cancer, data, sharing, catalog, metadata









Abstract Title: Empowering Cancer Research Discovery: An Institutional Data Catalog for Enhanced Data Sharing, Findability, and Accessibility









Abstract Summary:
The exponential growth of scientific data in cancer research and precision medicine presents substantial challenges for researchers striving to access and utilize diverse data resources. These challenges stem from data fragmentation across multiple databases and repositories, compounded by a lack of standardization in formats and metadata. Moreover, institutional data silos further impede collaboration and comprehensive analyses. To address these obstacles, the National Cancer Institute’s Childhood Cancer Data Initiative (CCDI) has developed the Childhood Cancer Data Catalog (CCDC), a groundbreaking searchable database housing pediatric data resources shared by the pediatric cancer research community. 

Collaborating closely with the CCDC project team, the Data Management Team in the Division of Population Science at DFCI customized the open-source CCDC codebase and built the DFCI Research Data Catalog (DFDC). The DFDC platform integrates rich metadata and high-value institutional resources, including solid tumor curation projects, the Profile cohort, Rapid Heme Panel data, cBioPortal, ImmunoProfile imaging data, and the Molecular Oncology Almanac. Additionally, it incorporates numerous community datasets such as those from the NIH All of Us program, GENIE, SEER, NCI’s Cancer Research Data Commons (CRDC), and hundreds of dataset entries graciously shared by the CCDC project team.

By providing researchers with easy access to a vast array of cancer data resources, DFDC streamlines the process of data discovery, enabling them to identify and access resources within hours rather than days or weeks. This enhanced accessibility facilitates more efficient secondary data analysis, thereby catalyzing advancements in cancer research. Furthermore, the DFDC serves as a platform for data sharing among DFCI researchers, fostering collaboration and encouraging greater data reuse within the scientific community.










Upload Abstract: {Empty}