NCI Data Jamboree (Project Abstract Submission): Submission #2
Submission information
Submission Number: 2
Submission ID: 183015
Submission UUID: 83b183a5-8018-47ee-9eb4-fb7c78f571cb
Submission URI: /nci/datajamboree/abstractsubmission
Submission Update: /nci/datajamboree/abstractsubmission?token=i4p8qaLnDOOf53RZqquzZMni3hiSLvyq04N8Qvtv-po
Created: Mon, 06/08/2026 - 20:30
Completed: Mon, 06/08/2026 - 20:30
Changed: Mon, 06/08/2026 - 20:30
Remote IP address: 10.208.24.28
Submitted by: Anonymous
Language: English
Is draft: No
Webform: NCI Data Jamboree (Abstracts)
Submitted to: NCI Data Jamboree (Project Abstract Submission)
Presenter Information
Megha
B.
Srivastava
B.S./M.S. in Computer Science
PhD Student in Computer Science
Stanford University
Stanford
Additional Authors
Abstract Information
Employing statistical, computational, and informatics tools, algorithms, and methods to integrate or analyze data
machine learning, AI-readiness, distribution shift, causal inference, confounding variables, language models, LLMs
Project Seeker
I am a PhD student in Computer Science with significant experience in machine learning, large language modeling, and human-AI interaction. I have recently been transitioning my research toward applications of AI in medicine, healthcare, and drug discovery, and I hope to understand what challenges exist at the dataset level and which datasets would be ideal for pushing different problems forward. One research area I am particularly interested in is distribution shift -- e.g., a mismatch between the training dataset and the data seen at test-time inference -- and how to tackle it. I am also curious about methods for identifying potential confounding variables that are unmeasured in the current dataset. My hope is to join a project that can help improve the quality and availability of oncology datasets for machine learning research.