VoicePilot: Harnessing LLMs as Speech Interfaces for Physically Assistive Robots
Akhil Padmanabha, Jessie Yuan, Janavi Gupta, Zulekha Karachiwalla, Carmel Majidi, Henny Admoni, Zackory Erickson
TL;DR
This work addresses the design of LLM-based speech interfaces for physically assistive robots that support independence in people with motor impairments. It presents a framework, refined iteratively over three versions, that integrates an off-the-shelf LLM with the Obi robot for feeding tasks, and validates the approach through a user study with 11 older adults using both qualitative and quantitative analyses. The key contributions are a final nine-component framework, five user-centered design guidelines (Customization, Multi-Step Instruction, Consistency, Comparable Time to Caregiver, Social Capability), and practical insights to guide researchers and designers in deploying LLMs for assistive robotics. The study demonstrates the potential of combining prompt/system engineering with human-centered evaluation to create usable, safe, and adaptable speech interfaces for robot-assisted care, with implications for independence and quality of life.
Abstract
Physically assistive robots present an opportunity to significantly increase the well-being and independence of individuals with motor impairments or other forms of disability who are unable to complete activities of daily living. Speech interfaces, especially ones that utilize Large Language Models (LLMs), can enable individuals to effectively and naturally communicate high-level commands and nuanced preferences to robots. Frameworks for integrating LLMs as interfaces to robots for high-level task planning and code generation have been proposed, but they fail to incorporate the human-centric considerations that are essential when developing assistive interfaces. In this work, we present a framework for incorporating LLMs as speech interfaces for physically assistive robots, constructed iteratively over three stages of testing with a feeding robot and culminating in an evaluation with 11 older adults at an independent living facility. We use both quantitative and qualitative data from the final study to validate our framework and additionally provide design guidelines for using LLMs as speech interfaces for assistive robots. Videos and supporting files are located on our project website: https://sites.google.com/andrew.cmu.edu/voicepilot/
