CliMB: An AI-enabled Partner for Clinical Predictive Modeling

Evgeny Saveliev, Tim Schubert, Thomas Pouplin, Vasilis Kosmoliaptsis, Mihaela van der Schaar

TL;DR

CliMB is a no-code, AI-enabled partner that empowers clinician scientists to create predictive models using natural language. It offers clear guidance and access to SOTA methods in data-centric AI, AutoML, and interpretable ML.

Abstract

Despite its significant promise and continuous technical advances, real-world applications of artificial intelligence (AI) remain limited. We attribute this to the "domain expert-AI-conundrum": while domain experts, such as clinician scientists, should be able to build predictive models such as risk scores, they face substantial barriers in accessing state-of-the-art (SOTA) tools. While automated machine learning (AutoML) has been proposed as a partner in clinical predictive modeling, many additional requirements need to be fulfilled to make machine learning accessible for clinician scientists. To address this gap, we introduce CliMB, a no-code AI-enabled partner designed to empower clinician scientists to create predictive models using natural language. CliMB guides clinician scientists through the entire medical data science pipeline, thus empowering them to create predictive models from real-world data in just one conversation. CliMB also creates structured reports and interpretable visuals. In evaluations involving clinician scientists and systematic comparisons against a baseline GPT-4, CliMB consistently demonstrated superior performance in key areas such as planning, error prevention, code execution, and model performance. Moreover, in blinded assessments involving 45 clinicians from diverse specialties and career stages, more than 80% preferred CliMB over GPT-4. Overall, by providing a no-code interface with clear guidance and access to SOTA methods in the fields of data-centric AI, AutoML, and interpretable ML, CliMB empowers clinician scientists to build robust predictive models. The proof-of-concept version of CliMB is available as open-source software on GitHub: https://github.com/vanderschaarlab/climb.

Paper Structure (30 sections, 7 figures, 7 tables)

This paper contains 30 sections, 7 figures, 7 tables.

Figures (7)

  • Figure 1: Cycle of clinical predictive modeling. In this proposed cycle, data is generated during patient care by clinicians or clinician scientists. This data can be deposited in EHRs, registries, biobanks, etc. Clinician scientists then create predictive models using SOTA machine learning. Empowering them to do so is the focus of this paper. Subsequently, predictive models are evaluated and then implemented in patient care. This cycle will likely be repeated to adapt to changing features/demographics or new challenges. Figure created with Biorender.com
  • Figure 2: An AI-enabled partner for clinical predictive model building: CliMB. Clinician scientists turn to CliMB with a predictive problem and real-world data. CliMB guides the clinician scientist through all phases of the data science pipeline with robust planning and SOTA tools, including the AutoML pipeline AutoPrognosis 2.0 (Imrie et al., 2023) alongside data-centric and interpretability tools (see the table of available tools). The clinician scientist and CliMB partner to generate a predictive model, visualizations, and a summary report of the methodology. Figure created with Biorender.com
  • Figure 3: Design of CliMB. The information graph highlights the flow of information within CliMB. The process begins with the user describing their clinical problem and uploading a corresponding medical dataset (0). The memory unit stores a working directory for tools, continuously updated plans tracking overall progress, files, and logs of the entire user interaction (g). The reasoning unit receives information from the memory (a), the user (h), and through self-reflection, and integrates this feedback (b) to facilitate subsequent planning (c). The action unit executes the plan (d) and generates multimodal outputs, which are stored in the memory and displayed via the user interface (e). The clinician scientist interacts with the user interface (f) and, at the end of an episode (see the methods section on the reasoning unit), validates and concludes the current phase (i). Figure created with Biorender.com
  • Figure 4: The reasoning unit modeled as an episodic multi-armed bandit. The integration of this unit in the information graph detailed in Figure 3 is as follows: ⓪ Initial objective given by the user, ⓑ Feedback mechanism, ⓓ Calling of actions, ⓘ Validation by the user.
  • Figure 5: Illustrative usage of CliMB. Snippets of interactions during a full end-to-end model-building session are provided from all four phases of the medical data science project: (1) data exploration (shown: Exploratory Data Analysis), (2) data engineering (shown: data transformation), (3) model building (shown: AutoPrognosis 2.0 Survival Analysis; Imrie et al., 2023), and (4) model exploitation (shown: SHAP explainer, Lundberg and Lee, 2017, for interpreting feature importance). Note: illustrative examples shown do not necessarily correspond to the sessions used in experiments. Figure created with Biorender.com
  • ...and 2 more figures
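
The information flow described in the Figure 3 caption (memory, reasoning, action, user validation) can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not CliMB's implementation: the names `Memory`, `reason`, `act`, and `run_session` are hypothetical, the "action" is a placeholder string rather than real tool execution, and the bandit formulation from Figure 4 is not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Memory:
    """Hypothetical memory unit: holds the current plan and an interaction log."""
    plan: List[str]
    log: List[str] = field(default_factory=list)


def reason(memory: Memory, feedback: str) -> str:
    """Hypothetical reasoning unit.

    Records any feedback (arrows a, b, h in the caption), then selects the
    next pending plan step (arrow c).
    """
    if feedback:
        memory.log.append(f"feedback: {feedback}")
    return memory.plan[0]


def act(step: str) -> str:
    """Hypothetical action unit: stands in for executing a tool (arrow d)."""
    return f"output of {step}"


def run_session(memory: Memory, validate: Callable[[str, str], bool]) -> List[str]:
    """Loop until every plan step has been executed and validated by the user."""
    outputs: List[str] = []
    feedback = ""
    while memory.plan:
        step = reason(memory, feedback)
        result = act(step)
        memory.log.append(f"{step}: {result}")  # output stored in memory (arrow e)
        if validate(step, result):  # user validates the phase (arrow i)
            memory.plan.pop(0)
            outputs.append(result)
            feedback = ""
        else:
            feedback = f"redo {step}"  # rejected steps are retried with feedback
    return outputs


if __name__ == "__main__":
    # The four phases named in the Figure 5 caption serve as the plan.
    phases = ["data exploration", "data engineering",
              "model building", "model exploitation"]
    session = Memory(plan=list(phases))
    for line in run_session(session, lambda step, result: True):
        print(line)
```

Here the user's role is reduced to a `validate` callback that accepts or rejects each step. In the system the caption describes, that role is played by the clinician scientist through the user interface.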