Childhood Cancer Data Initiative Annual Symposium (Abstract Registration): Submission #60
Submission information
Submission Number: 60
Submission ID: 150493
Submission UUID: 9b5a2d3c-8cd5-4e2f-a64d-0a3db5ad1be8
Submission URI: /nci/ccdisymposium/abstract
Created: Mon, 09/01/2025 - 21:13
Completed: Mon, 09/01/2025 - 21:46
Changed: Mon, 09/01/2025 - 21:46
Remote IP address: 10.208.28.30
Submitted by: Anonymous
Language: English
Is draft: No
Abstract Submission for Poster Presentation
Generalizable Pediatric Sarcoma Histopathology Classification with Multi-Institutional Machine Learning
Intro: Digitization of histopathology slides has enabled machine learning and artificial intelligence (AI) approaches to aid diagnosis. Such tools could be especially helpful for classifying pediatric sarcoma subtypes, which are rare and heterogeneous and whose diagnosis often requires costly genetic and molecular testing that is not available to every patient. However, these models are prone to overfitting to an individual institution’s microscope, scanner, and staining protocol, which can degrade performance in real-world clinical settings where instruments and protocols differ across hospitals. Unless they are trained on a large and diverse dataset, such models are therefore difficult to generalize for global use.
Methods: We curated more than 700 hematoxylin and eosin (H&E) images from four institutions, spanning more than 10 sarcoma subtypes. An in-house, open-source pipeline harmonizes the images through stain normalization, focus checking, and cropping. AI models then extract features from the harmonized images, and these features are used for downstream SAMPLER-based machine learning.
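As a rough illustration of the downstream step, the sketch below condenses tile-level features into one fixed-length vector per slide by taking per-feature quantiles. It assumes that a SAMPLER-style representation summarizes the distribution of each feature across a slide's tiles; the function names, the choice of quantiles, and the toy feature values are hypothetical and are not taken from the authors' pipeline.

```python
from typing import List, Sequence


def quantile(sorted_vals: Sequence[float], q: float) -> float:
    """Linear-interpolation quantile of an already sorted sequence (0 <= q <= 1)."""
    if not sorted_vals:
        raise ValueError("need at least one value")
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def slide_representation(
    tile_features: Sequence[Sequence[float]],
    quantiles: Sequence[float] = (0.1, 0.5, 0.9),
) -> List[float]:
    """Summarize a slide as per-feature quantiles over its tiles.

    tile_features holds one feature vector per tile (hypothetical input).
    The result has len(quantiles) entries per feature, so slides with
    different tile counts map to vectors of the same length.
    """
    if not tile_features:
        raise ValueError("slide has no tiles")
    n_features = len(tile_features[0])
    representation: List[float] = []
    for j in range(n_features):
        # Sort the j-th feature across all tiles, then read off its quantiles.
        column = sorted(tile[j] for tile in tile_features)
        representation.extend(quantile(column, q) for q in quantiles)
    return representation


if __name__ == "__main__":
    # Three toy tiles with two features each.
    tiles = [[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]]
    print(slide_representation(tiles, quantiles=(0.0, 0.5, 1.0)))
```

Because the output length depends only on the number of features and quantiles, a standard classifier can be trained on these vectors regardless of slide size.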
Results: We achieve state-of-the-art results in classifying images as rhabdomyosarcoma vs non-rhabdomyosarcoma soft tissue sarcomas (AUC 0.969 ± 0.026), alveolar vs embryonal rhabdomyosarcoma (AUC 0.961 ± 0.021), and Ewing sarcoma (AUC 0.929). Importantly, our models generalize well when tested on data from previously unseen institutions, outperforming similar methods.
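One way to examine generalization across institutions is to compute the AUC separately for each held-out institution and then report the mean and standard deviation. The sketch below does this with a pairwise (Mann-Whitney) AUC. The abstract does not state how its ± values were obtained, so this grouping is an assumption, and the record layout, institution names, and scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, Iterable, List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_institution(
    records: Iterable[Tuple[str, int, float]],
) -> Dict[str, float]:
    """Group (institution, label, score) records and compute one AUC per institution."""
    labels: Dict[str, List[int]] = defaultdict(list)
    scores: Dict[str, List[float]] = defaultdict(list)
    for institution, label, score in records:
        labels[institution].append(label)
        scores[institution].append(score)
    return {inst: roc_auc(labels[inst], scores[inst]) for inst in labels}


if __name__ == "__main__":
    # Hypothetical predictions on slides from two held-out institutions.
    records = [
        ("site_A", 1, 0.9), ("site_A", 0, 0.1),
        ("site_A", 1, 0.4), ("site_A", 0, 0.6),
        ("site_B", 1, 0.8), ("site_B", 0, 0.2),
    ]
    per_site = auc_by_institution(records)
    values = list(per_site.values())
    print(per_site, mean(values), stdev(values))
```

In a leave-one-institution-out setting, the scores for each institution would come from a model trained only on the other institutions.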
Conclusion: Our pipeline is well suited for additional collaboration and could help bridge gaps in access to clinical resources globally.
The Jackson Laboratory