Redefining Digital Health Interfaces with Large Language Models

Fergus Imrie, Paulius Rauba, Mihaela van der Schaar

TL;DR

We demonstrate how LLM-based systems, with LLMs acting as agents, can use external tools and provide a novel interface between clinicians and digital technologies. This enhances the utility and practical impact of digital healthcare tools and AI models while addressing current issues with using LLMs in clinical settings.

Abstract

Digital health tools have the potential to significantly improve the delivery of healthcare services. However, their adoption remains comparatively limited due, in part, to challenges surrounding usability and trust. Large Language Models (LLMs) have emerged as general-purpose models with the ability to process complex information and produce human-quality text, presenting a wealth of potential applications in healthcare. Directly applying LLMs in clinical settings is not straightforward, however, with LLMs susceptible to providing inconsistent or nonsensical answers. We demonstrate how LLM-based systems can utilize external tools and provide a novel interface between clinicians and digital technologies. This enhances the utility and practical impact of digital healthcare tools and AI models while addressing current issues with using LLMs in clinical settings such as hallucinations. We illustrate LLM-based interfaces with the example of cardiovascular disease risk prediction. We develop a new prognostic tool using automated machine learning and demonstrate how LLMs can provide a unique interface to both our model and existing risk scores, highlighting the benefit compared to traditional interfaces for digital tools.

Paper Structure

This paper contains 16 sections, 10 figures, and 3 tables.

Figures (10)

  • Figure 1: Clinicians have previously needed to interact directly with digital tools, such as risk scores. While others have discussed LLMs replacing existing clinical tools (1,2), we envisage LLMs forming a novel interface by enabling dynamic interactions and facilitating deeper engagement with tools and related information, such as model explainability, medical papers, and clinical guidelines (3).
  • Figure 2: Calibration curves. Calibration curves for our approach (AP2), QRisk3, SCORE2, and the Framingham score, both before and after calibrating the existing risk scores to the UK Biobank cohort. Observed risk was calculated using Kaplan-Meier estimators (Kaplan and Meier, 1958).
  • Figure 3: Decision curve analysis. Our approach (AP2) provides greater net benefit at all thresholds than the existing risk scores (Framingham score, SCORE2, and QRisk3) and the baseline strategies (All and None).
  • Figure 4: Feature importance. SHAP values of the variables included in the AutoPrognosis model.
  • Figure 5: Overview of an LLM-based system that enables clinicians to interface with digital tools using natural language inputs. (1) The LLM is provided with the history of the interaction, including the current request. (2) Using an iterative reasoning process, the LLM decides which, if any, tools are required and with what input. (3) The LLM provides a response to the user incorporating information provided by any tools that were used.
  • ...and 5 more figures
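
The three steps in the Figure 5 caption can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_risk_score`, `stub_policy`, and `run_turn` are hypothetical names, the "risk score" is a placeholder formula with no clinical meaning, and a deterministic function stands in for the LLM's reasoning step so that the example runs without any model.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tool: a placeholder formula, NOT a real clinical risk score.
def toy_risk_score(age: float) -> float:
    return round(min(1.0, max(0.0, (age - 30) / 100)), 2)

# Registry of tools the system may call (assumed structure).
TOOLS: Dict[str, Callable[[float], float]] = {"toy_risk_score": toy_risk_score}

Action = Tuple[str, Optional[float]]

def stub_policy(history: List[str]) -> Action:
    """Stand-in for the LLM: decide which tool, if any, to call next."""
    if any(line.startswith("tool:") for line in history):
        return ("respond", None)  # a tool result is already available
    match = re.search(r"\d+", history[-1])
    if match is None:
        return ("respond", None)  # nothing to compute, answer directly
    return ("toy_risk_score", float(match.group()))

def run_turn(history: List[str], policy: Callable[[List[str]], Action],
             max_steps: int = 5) -> str:
    history = list(history)  # (1) interaction history, including the request
    for _ in range(max_steps):
        action, arg = policy(history)  # (2) iterative choice of tool and input
        if action == "respond":
            break
        result = TOOLS[action](arg)
        history.append(f"tool:{action}({arg}) -> {result}")
    # (3) build a response that incorporates any tool outputs
    tool_lines = [line for line in history if line.startswith("tool:")]
    if tool_lines:
        return "Response based on: " + "; ".join(tool_lines)
    return "Response without tools."

if __name__ == "__main__":
    print(run_turn(["user: estimate risk for a patient aged 60"], stub_policy))
```

In a real system the policy would be an LLM call and the tools would be validated risk models; the point of the sketch is only that numerical outputs come from the tools, with the language model choosing which tool to call and composing the reply.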