LLMs Enable Context-Aware Augmented Reality in Surgical Navigation

Hamraz Javaheri, Omid Ghamarnejad, Paul Lukowicz, Gregor Alexander Stavrou, Jakob Karolus

TL;DR

The paper tackles interaction challenges in wearable AR for surgery by introducing a voice-controlled user interface (VCUI) that leverages large language models. It compares a baseline speech-command VCUI to an LLM-based VCUI, both integrated with an AR assistance system (ARAS) designed for open pancreatic surgery, in simulated tasks and real surgeries. The LLM-based VCUI significantly reduces task completion time and cognitive workload, and surgeons express a strong preference for its natural, context-aware control, though dictation reliability and feedback design need refinement. The findings highlight the potential of context-aware LLMs to enhance intraoperative decision support and suggest a hybrid interaction approach that combines the strengths of speech commands and LLM-based control for surgical AR applications.

Abstract

Wearable Augmented Reality (AR) technologies are gaining recognition for their potential to transform surgical navigation systems. As these technologies evolve, selecting the right interaction method to control the system becomes crucial. Our work introduces a voice-controlled user interface (VCUI) for surgical AR assistance systems (ARAS), designed for pancreatic surgery, that integrates Large Language Models (LLMs). Employing a mixed-method research approach, we assessed the usability of our LLM-based design in both simulated surgical tasks and during pancreatic surgeries, comparing its performance against a conventional VCUI for surgical ARAS that uses speech commands. Our findings demonstrated the usability of our proposed LLM-based VCUI, which yielded a significantly lower task completion time and cognitive workload than speech commands. Additionally, qualitative insights from interviews with surgeons aligned with the quantitative data, revealing a strong preference for the LLM-based VCUI. Surgeons emphasized its intuitiveness and highlighted the potential of the LLM-based VCUI to expedite decision-making in surgical environments.

Paper Structure

This paper contains 31 sections, 9 figures, 3 tables.

Figures (9)

  • Figure 1: Chart showing the phases of our work, from pre-design through the case study involving a clinical trial, and the associated data collection for each phase.
  • Figure 2: The characteristics of in-situ visualization and supportive data visualization feature set provided by the ARAS software.
  • Figure 3: A top view of the surgical room setup, showing the placement of the medical staff around the table. The limited area around the operation field and sterilization rules restrict interaction with the AR surgical assistance system. The surgeons are numbered based on their role in the surgery, with Number 1 being the lead surgeon.
  • Figure 4: Anatomical drawing of 3D reconstructed segments in ARAS and their positions around the pancreas. The names of these segments were used as keywords for VCUI using speech commands.
  • Figure 5: Overview of the LLM-based VC framework. The system begins by loading patient files and function descriptions to generate an initial prompt. The system functions are then called in response to the user's spoken query.
  • ...and 4 more figures
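
The flow described in the Figure 5 caption (load patient files and function descriptions, build an initial prompt, then call system functions in response to a spoken query) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the prompt layout, the JSON reply format, the function names (`show_segment`, `hide_segment`), the segment name `segment_a`, and the `fake_llm` stand-in are all assumptions made for illustration, and speech-to-text is assumed to have already produced the query string.

```python
import json
from typing import Callable, Dict

# Hypothetical system functions; the real ARAS function set is not given here.
def show_segment(name: str) -> str:
    return f"showing {name}"

def hide_segment(name: str) -> str:
    return f"hiding {name}"

FUNCTIONS: Dict[str, Callable[..., str]] = {
    "show_segment": show_segment,
    "hide_segment": hide_segment,
}

# Hypothetical natural-language descriptions passed to the model.
FUNCTION_DESCRIPTIONS: Dict[str, str] = {
    "show_segment": "Display a 3D reconstructed segment by name.",
    "hide_segment": "Hide a 3D reconstructed segment by name.",
}

def build_initial_prompt(patient_file: dict, descriptions: Dict[str, str]) -> str:
    """Combine patient data and function descriptions into one prompt string."""
    lines = ["Patient data:", json.dumps(patient_file, sort_keys=True),
             "Available functions:"]
    lines += [f"- {name}: {desc}" for name, desc in sorted(descriptions.items())]
    return "\n".join(lines)

def fake_llm(prompt: str, query: str) -> str:
    """Stand-in for a real model: a keyword rule, for demonstration only."""
    action = "hide_segment" if "hide" in query.lower() else "show_segment"
    return json.dumps({"name": action, "arguments": {"name": "segment_a"}})

def dispatch(llm_reply: str) -> str:
    """Run the function call returned as JSON; reject unknown function names."""
    call = json.loads(llm_reply)
    func = FUNCTIONS.get(call["name"])
    if func is None:
        return f"unknown function: {call['name']}"
    return func(**call.get("arguments", {}))

if __name__ == "__main__":
    prompt = build_initial_prompt({"segments": ["segment_a"]}, FUNCTION_DESCRIPTIONS)
    print(dispatch(fake_llm(prompt, "Please hide that segment")))
```

Routing every model reply through a fixed registry means the model can only trigger functions the system already exposes, which is one plausible way to keep such an interface predictable; whether the paper's system does this is not stated in this summary.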