LLMs Enable Context-Aware Augmented Reality in Surgical Navigation
Hamraz Javaheri, Omid Ghamarnejad, Paul Lukowicz, Gregor Alexander Stavrou, Jakob Karolus
TL;DR
The paper tackles interaction challenges in wearable AR for surgery by introducing a voice-controlled user interface (VCUI) that leverages large language models (LLMs). It compares a baseline speech-command VCUI with an LLM-based VCUI integrated with an AR assistance system (ARAS) designed for open pancreatic surgery, using simulated tasks and real surgeries. The LLM-based VCUI significantly reduces task completion time and cognitive workload, and surgeons express a strong preference for its natural, context-aware control, though dictation reliability and feedback design still need refinement. The findings highlight the potential of context-aware LLMs to enhance intraoperative decision support and suggest a hybrid interaction approach that combines the strengths of speech commands and LLM-based control for surgical AR applications.
Abstract
Wearable Augmented Reality (AR) technologies are gaining recognition for their potential to transform surgical navigation systems. As these technologies evolve, selecting the right interaction method to control the system becomes crucial. Our work introduces a voice-controlled user interface (VCUI) for surgical AR assistance systems (ARAS), designed for pancreatic surgery, that integrates Large Language Models (LLMs). Employing a mixed-method research approach, we assessed the usability of our LLM-based design both in simulated surgical tasks and during pancreatic surgeries, comparing its performance against a conventional speech-command VCUI for surgical ARAS. Our findings demonstrated the usability of the proposed LLM-based VCUI, which yielded significantly lower task completion time and cognitive workload than speech commands. Additionally, qualitative insights from interviews with surgeons aligned with the quantitative data, revealing a strong preference for the LLM-based VCUI. Surgeons emphasized its intuitiveness and highlighted the potential of LLM-based VCUIs to expedite decision-making in surgical environments.
