NCI Office of Data Sharing (ODS) Data Jamboree (Abstract Submissions): Submission #29

Submission information
Submission Number: 29
Submission ID: 145182
Submission UUID: 6db1c14f-1843-455d-8af4-68fb36628733

Created: Mon, 06/23/2025 - 17:04
Completed: Mon, 06/23/2025 - 17:05
Changed: Mon, 06/23/2025 - 17:05

Remote IP address: 10.208.24.253
Submitted by: Anonymous
Language: English

Is draft: No
serial: '29'
sid: '145182'
uuid: 6db1c14f-1843-455d-8af4-68fb36628733
uri: /nci/ods-data-jamboree/abstractsubmissions
created: '1750712687'
completed: '1750712728'
changed: '1750712728'
in_draft: '0'
current_page: ''
remote_addr: 10.208.24.253
uid: '0'
langcode: en
webform_id: nci_office_of_data_sharing_abstr
entity_type: node
entity_id: '2107'
locked: '0'
sticky: '0'
notes: ''
metatag: meta
data:
  category: 'Methods to enable data interoperability'
  degree_s_: Ph.D.
  email: yin.lu@icf.com
  first_name: Yin
  keywords_abstracts: 'Pediatric cancer, AI-readiness, Cohort refinement, Biomedical data quality'
  last_name: Lu
  middle_initial: ''
  organization: ICF
  organization_address:
    address: ''
    address_2: ''
    city: Rockville
    country: ''
    postal_code: ''
    state_province: ''
  other_please_specify_: ''
  summary: |
    While large-scale childhood cancer datasets are increasingly available, researchers often struggle to determine which cohorts are suitable for AI and machine learning applications because of inconsistencies in data quality, completeness, and standardization. This project proposes a modular framework, CC-CARE-AI (Childhood Cancer Cohort Assessment and REfinement for AI), to assess and refine the AI-readiness of childhood cancer cohorts from the Gabriella Miller Kids First Program (Kids First), TARGET, and the Childhood Cancer Data Initiative (CCDI). CC-CARE-AI generates domain-specific readiness scores across clinical, genomic, and imaging data using a transparent, multi-criteria evaluation system. The framework also incorporates tools for cohort refinement and decision support, delivered through interactive visualizations and dashboards, so that researchers can identify high-quality subsets and enhance data usability. By aligning data quality with specific research and machine learning needs, the framework facilitates more effective and responsible use of AI in pediatric oncology.
    The project will use Python, R, and tools such as pandas and Streamlit on the Seven Bridges Cancer Genomics Cloud for secure, scalable, and reproducible analysis.

    The project team will be led by Dr. Yin Lu (Lead Bioinformatics Analyst) and includes Mr. Alexander Pilozzi (Bioinformatics Analyst) and Dr. Alejandro M. Sevillano (Bioinformatics Analyst) from the Health Analytics and Research Technologies division at ICF, with expertise in cancer data management, AI-readiness assessment, and cloud-based analysis. The team brings relevant experience from the CPTAC program, the NIDDK Data Centric Challenge, the ARPA-H Biomedical Data Fabric, and CRDC integration efforts.
  title: 'Lead Bioinformatics Analyst'
  ttile: 'CC-CARE-AI: A Framework for Assessing and Refining AI-Readiness of Childhood Cancer Cohorts from Kids First, TARGET, and CCDI'
  upload_abstract: '65604'