Data Resources and Project Pre-work
Childhood Cancer Datasets and Computational Resources:
- A variety of data types can be used or integrated, including but not limited to: genomics; transcriptomics; imaging; spatial omics; proteomics; metabolomics; clinical; real-world data; cancer registry; and population and epidemiology data.
- Use of open-access, managed-access, and controlled-access data is allowed. Examples of childhood cancer databases or data hubs include, but are not limited to, TARGET, NCCR Data Platform, CCDI Data Hub, Kids First, dbGaP, and GEO. See the list of example controlled-access childhood cancer datasets in dbGaP.
- Project ideas can be submitted as long as the Project Teams use datasets that are publicly accessible to other participants, such as those listed above, alone or in combination with their own datasets, and as long as jamboree summaries and outputs can be publicly shared.
- Projects may develop procedures and methods to build specific disease cohorts from clinical features; develop or refine analysis tools, pipelines, visualization techniques, and AI/ML algorithms; employ statistical methods or existing computational, mathematical, or informatics tools to address scientific questions; improve data interoperability, such as bringing public data together with lab-generated data for analysis; and/or generate tutorial pipelines and educational tools, data storytelling, infographics, and other creative uses of childhood cancer data.
- Participants may leverage NCI's Cancer Research Data Commons (CRDC) cloud resources for cloud computing (e.g., Seven Bridges Cancer Genomics Cloud, ISB Cancer Gateway Cloud), the NCCR Data Platform, cBioPortal for Cancer Genomics, and similar resources, or download data for analysis.
Project Selection, Team Orientation and Pre-work:
- Once the jamboree planning committee, comprising experts in oncology, data science, and other relevant fields, has reviewed the submitted projects against established criteria, submitters will be notified to confirm their availability and interest by early July.
- The planning committee, in coordination with CRDC, NCCR, the Data Access Committee (DAC), and other support staff, will hold regularly recurring virtual orientation and pre-work sessions in the summer of 2025 (July - September). The goal is to prepare and orient selected teams with documentation, tutorials, and code for account setup, data access, cloud computing, and other necessary work so that participants come to the in-person event ready to analyze data. Preparatory work could include guidance, facilitated by the ODS DAC, on how project teams can access managed and controlled data. Citizen scientists or graduate students who are ineligible to request controlled-access data through dbGaP can team up with a principal investigator or equivalent to gain access to such data.
- Because time at the in-person event is limited, selected project teams are highly encouraged to start their projects before and/or during the two-month virtual preparatory sessions. Project teams are required to complete data downloads or bring all data into their virtual cloud computing environment prior to the event.
- Although completion of data analysis or tool development is not required, teams must provide summaries of observations, code, and methods, along with feedback on data processing, data access, and data analysis tools and suggestions for improvement in relevant areas.
Questions?
For questions about the data jamborees in general or about the application process, please email Emily Boja from NCI's Office of Data Sharing (emily.boja@nih.gov).