NCI Data Jamboree (Project Abstract Submission): Submission #2

National Cancer Institute

Submission information

Submission Number: 2

Submission ID: 183015

Submission UUID: 83b183a5-8018-47ee-9eb4-fb7c78f571cb

Submission URI: /nci/datajamboree/abstractsubmission

Submission Update: /nci/datajamboree/abstractsubmission?token=i4p8qaLnDOOf53RZqquzZMni3hiSLvyq04N8Qvtv-po

Created: Mon, 06/08/2026 - 20:30

Completed: Mon, 06/08/2026 - 20:30

Changed: Mon, 06/08/2026 - 20:30

Remote IP address: 10.208.24.28

Submitted by: Anonymous

Language: English

Is draft: No

Webform: NCI Data Jamboree (Abstracts)

Submitted to: NCI Data Jamboree (Project Abstract Submission)

serial: '2'
sid: '183015'
uuid: 83b183a5-8018-47ee-9eb4-fb7c78f571cb
uri: /nci/datajamboree/abstractsubmission
created: '1780965024'
completed: '1780965032'
changed: '1780965032'
in_draft: '0'
current_page: ''
remote_addr: 10.208.24.28
uid: '0'
langcode: en
webform_id: nci_data_jamboree_abstracts
entity_type: node
entity_id: '2272'
locked: '0'
sticky: '0'
notes: ''
metatag: meta
data:
  list_of_additional_authors: {  }
  category: 'Employing statistical, computational, and informatics tools, algorithms, and methods to integrate or analyze data'
  degree_s_: 'B.S./M.S. in Computer Science'
  email: megha@cs.stanford.edu
  first_name: Megha
  keywords_abstracts: 'machine learning, AI-readiness, distribution shift, causal inference, confounding variables, language models, LLMs'
  last_name: Srivastava
  middle_initial: B.
  organization: 'Stanford University'
  organization_address:
    address: ''
    address_2: ''
    city: Stanford
    country: ''
    postal_code: ''
    state_province: ''
  summary: 'I am a PhD student in Computer Science with significant experience in machine learning, large language modeling, and human-AI interaction. I have recently been transitioning my research toward applications of AI in medicine, healthcare, and drug discovery, and I hope to understand what challenges exist at the dataset level and which datasets would be ideal for pushing different problems forward. One research area I am particularly interested in is distribution shift (e.g., a mismatch between the training dataset and the data seen at test-time inference) and how to address it. I am especially curious about methods for identifying potential confounding variables that are unmeasured in the current dataset. My hope is to join a project that can help improve the quality and availability of oncology datasets for machine learning research.'
  title: 'PhD Student in Computer Science'
  ttile: 'Project Seeker '