AI Data Limitations Workshop (Agenda)


A brief summary of the planned workshop sessions is provided below. A detailed agenda with speakers and presentation titles will be shared in January.

DAY 1, April 3, 2023 (11 am to 4:30 pm EDT)

Session 1: Integrating classical structure prediction with machine learning towards drug discovery

Session chair: Trey Ideker, UCSD

This session will focus on expanding the field of structure prediction to incorporate multiple data modalities and layers of biological structure beyond the protein, as well as meta-learning for identifying targets for drug discovery.


Session 2: Chemical, genetic, and mechanical perturbations for understanding mechanisms in cancer: Extrapolating beyond existing data

Session chair: Fabian Theis, Helmholtz Zentrum München

In this session, researchers will discuss the use of large-scale perturbation data for causal modeling, combining representation learning with perturbation approaches, and methods to extrapolate beyond existing perturbation data.


Session 3: Multimodal learning in data-limited contexts: Leveraging tissue-level data for understanding cell-cell interactions in cancer

Session chair: Dana Pe’er, Memorial Sloan Kettering

This session will focus on multimodal learning in data-limited contexts, including modeling cell-cell interactions and predicting outcomes. Handling imbalances across multimodal data sets, as well as foundational models, will also be discussed.


DAY 2, April 4, 2023 (11 am to 3:30 pm EDT)

Session 4: Making use of large-scale, structured clinical research data and image repositories

Session chair: Ziad Obermeyer, UC Berkeley

In this session, researchers will discuss the use of large-scale clinical research data for machine learning models. Discussion topics include the use of synthetic data, considerations of bias, model generalizability, and the development of digital twins.


Session 5: Improving modeling of real-world evidence data in clinical research and clinical trial design

Session chair: Tianxi Cai, Harvard

This session will focus on modeling real-world evidence (RWE) data to support the development of clinical trials, including issues associated with RWE data such as electronic health record coding and unbalanced data.


Session 6: Cross-cutting discussion with session chairs

Discussion of the approaches and challenges identified during the workshop and opportunities for the future.