NCI Data Jamboree (Resources)

Data and Tools

A variety of cancer research data types and datasets are available for access, integration, and analysis, including but not limited to genomics, transcriptomics, imaging, spatial omics, proteomics, metabolomics, clinical data, real-world data (RWD), cancer registry data, and population and epidemiology data. Participants can use open-access, managed-access, and controlled-access data for their projects.

Data Discovery:

Participants who were funded to generate or collect datasets used in jamboree projects will have direct access rights to those datasets.

Secondary users can search for and request access to data from several entry points:

The jamboree allows access to and use of datasets through different mechanisms (open, registered, or controlled access), depending on the data type and on whether the data are at the raw or derived (processed) level.

To submit Data Access Requests (DARs) for controlled-access data through dbGaP, refer to the dbGaP Authorized Access System instructions. To streamline the access process and reduce the burden on investigators, NCI's Office of Data Sharing (ODS) created several Data Collections, each comprising many studies. Investigators can submit a DAR for any of the Collections listed below. Once approved for a Collection, an investigator can access all datasets in that Collection without submitting additional DARs for individual datasets.

  • NCI's Collection of Datasets for General Research Use (phs003014)
  • NCI's Collection of Datasets for Health, Medical, and Biomedical (phs003044)
  • NCI's Collection of Datasets for General Cancer Research (phs003967)
  • NCI's Collection of Datasets for Pediatric Cancer Research (phs003964)

Computational Tools and Resources:

Participants may leverage resources from NCI programs, such as the Cancer Research Data Commons (CRDC) Seven Bridges Cancer Genomics Cloud; tools from CCDI, HTAN, Informatics Technology for Cancer Research (ITCR), CPTAC, and IDC; and other NCI program resources. They may also use their own institutional computing environment and tools. Participants can use these resources to download data for on-premises analysis or to bring data into a cloud environment before the event.