NCI Data Jamboree (Project Abstract Submission): Submission #1

Submission information
Submission Number: 1
Submission ID: 181737
Submission UUID: eb259b57-46c5-4132-a402-c604065315c1

Created: Fri, 05/29/2026 - 09:31
Completed: Fri, 05/29/2026 - 09:41
Changed: Mon, 06/01/2026 - 16:51

Remote IP address: 10.208.28.199
Submitted by: Anonymous
Language: English

Is draft: No
Presenter Information
Ying
Huang
MD
HSA
NCI
Rockville
Additional Authors
  • First Name: Min
    Last Name: Zhang
    Affiliation: UCI
  • First Name: Linel
    Last Name: Demar
    Affiliation: Drew University
Abstract Information
Developing tutorials, workbooks, infographics, or creative use of data for educational and engagement purposes
Keywords: Findability; Accessibility; Governance Interoperability; Data Reuse Workflow Observability; NCI Repositories
A Pilot Discovery Friction Framework for Quantifying Research Initiation Burden Across Federated Oncology Data Ecosystems
The rapid expansion of federated oncology ecosystems has increased the number of controlled-access biomedical datasets, but translational investigators frequently encounter fragmented discovery pathways and complex governance workflows. While these controls are essential for participant privacy, the operational burden they impose remains poorly characterized. This project will develop and pilot the Discovery Friction Framework, a human-centered observability framework designed to quantify "research initiation burden" across federated ecosystems.

The framework employs structured workflow instrumentation (screen recording, event logging, and rubric-based telemetry extraction) to capture objective operational metrics across five domains: discovery burden, authentication complexity, workflow instability, governance complexity, and temporal burden. Metrics include portal transitions, unresolved discovery paths, authentication redirection chains, manual workarounds, Data Access Request (DAR) revision cycles, and time-to-access intervals.
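As an illustration only, the sketch below shows how a logged investigator session might be reduced to a few of the metrics listed above. The `Event` record, its `kind` values, and the portal labels are hypothetical placeholders and not a schema defined by this project.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """One logged step in an investigator session (hypothetical schema)."""
    timestamp: datetime
    kind: str    # assumed values: "page_view", "auth_redirect", "workaround",
                 # "dar_revision", "access_granted"
    portal: str  # assumed portal label, e.g. "dbGaP"


def summarize_session(events: list[Event]) -> dict:
    """Reduce an event log to a handful of friction metrics."""
    events = sorted(events, key=lambda e: e.timestamp)

    # Portal transitions: consecutive events recorded on different portals.
    transitions = sum(1 for a, b in zip(events, events[1:]) if a.portal != b.portal)

    # Authentication redirection chain: longest run of consecutive redirects.
    longest_chain = current = 0
    for e in events:
        current = current + 1 if e.kind == "auth_redirect" else 0
        longest_chain = max(longest_chain, current)

    # Time-to-access: first event to first "access_granted", if access was reached.
    granted = next((e.timestamp for e in events if e.kind == "access_granted"), None)
    time_to_access = granted - events[0].timestamp if granted is not None else None

    return {
        "portal_transitions": transitions,
        "longest_auth_redirect_chain": longest_chain,
        "manual_workarounds": sum(e.kind == "workaround" for e in events),
        "dar_revision_cycles": sum(e.kind == "dar_revision" for e in events),
        "time_to_access": time_to_access,
    }
```

Unresolved discovery paths are left out of this sketch because identifying them would likely require rubric-based review rather than a simple count over event types.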

To test the framework against realistic discovery pathways, the project will identify high-value datasets across multiple modalities by drawing on published secondary data analyses that used dbGaP (genomics), NCCR (clinical), and CRDC IDC (imaging).

By characterizing translational workflow complexity through real-world investigator interactions, this pilot will generate foundational telemetry primitives, workflow observability methods, and evidence-based insights that may support future scalable observability strategies across biomedical data ecosystems. The project will provide actionable guidance for improving pathway transparency, governance intelligibility, and translational data access coordination across the NCI data ecosystem.

Implementation Requirements:
The project requires only a standard web-based testing environment; no high-performance computing is needed. Software tools include open-source user-session logging, screen-capture instrumentation, and text-mining packages (Python/R) to structure qualitative rubrics into quantitative dataframes. The project relies on a cross-disciplinary team: Human-Centered Design/UX researchers to build telemetry rubrics; data governance specialists and Data Access Committee (DAC) members familiar with dbGaP and NIH data access mechanisms to map workflow pathways; and front-end/data engineers to develop the underlying schema for the friction telemetry database.
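One possible shape for the friction telemetry database is sketched below, using Python's built-in sqlite3 module. The table name, column names, ordinal rubric labels, and scoring scale are all assumptions made for illustration; the actual schema and rubric would be defined by the project team.

```python
import sqlite3

# Hypothetical ordinal rubric: reviewer-assigned labels mapped to numeric scores.
RUBRIC = {"none": 0, "minor": 1, "moderate": 2, "severe": 3}

# Hypothetical single-table schema: one row per coded observation.
SCHEMA = """
CREATE TABLE observation (
    session_id TEXT NOT NULL,
    repository TEXT NOT NULL,
    domain     TEXT NOT NULL,
    label      TEXT NOT NULL,
    score      INTEGER NOT NULL
);
"""


def load_observations(conn, coded):
    """Insert (session_id, repository, domain, label) tuples with numeric scores.

    A label missing from RUBRIC raises KeyError, so miscoded entries fail loudly.
    """
    rows = [(s, r, d, label, RUBRIC[label]) for s, r, d, label in coded]
    conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?, ?)", rows)


def mean_score_by_domain(conn):
    """Return the average rubric score for each friction domain."""
    cur = conn.execute(
        "SELECT domain, AVG(score) FROM observation GROUP BY domain ORDER BY domain"
    )
    return dict(cur.fetchall())


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
load_observations(conn, [
    ("s1", "dbGaP", "discovery burden", "minor"),
    ("s1", "dbGaP", "authentication complexity", "moderate"),
    ("s2", "IDC", "authentication complexity", "severe"),
])
summary = mean_score_by_domain(conn)
```

Keeping the qualitative label alongside the numeric score lets the rubric be rescaled later without re-coding the underlying sessions.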