How Can Large Language Models Enable Better Socially Assistive Human-Robot Interaction: A Brief Survey
Zhonghao Shi, Ellen Landrum, Amy O'Connell, Mina Kian, Leticia Pinto-Alva, Kaleen Shrestha, Xiaoyuan Zhu, Maja J Matarić
TL;DR
This survey investigates how large language models can enhance socially assistive human-robot interaction by addressing three core challenges: natural language dialog, multimodal understanding, and LLM-driven robot policies. It synthesizes evidence that LLMs improve dialog quality, enable flexible multimodal reasoning via vision-language models, and permit more natural robot policies, while also highlighting risks such as hallucinations, latency, bias, and privacy concerns. The paper discusses potential applications ranging from motivational coaching to educational support, and outlines key research directions and safety considerations for responsible deployment. Overall, it provides a roadmap for integrating LLMs into SARs to deliver more personalized, context-aware, and scalable social assistance.
Abstract
Socially assistive robots (SARs) have shown great success in providing personalized cognitive-affective support for user populations with special needs, such as older adults, children with autism spectrum disorder (ASD), and individuals with mental health challenges. The large body of work on SAR demonstrates its potential to provide at-home support that complements clinic-based interventions delivered by mental health professionals, making these interventions more effective and accessible. However, several major technical challenges still prevent SAR-mediated interactions and interventions from reaching human-level social intelligence and efficacy. Recent advances in large language models (LLMs) open up novel applications within the field of SAR that could significantly expand the current capabilities of SARs. At the same time, incorporating LLMs introduces new risks and ethical concerns that the field has not yet encountered and that must be carefully addressed before these more advanced systems can be deployed safely. In this work, we present a brief survey of the use of LLMs in SAR technologies and discuss the potential and risks of applying LLMs to the following three major technical challenges of SAR: 1) natural language dialog; 2) multimodal understanding; and 3) LLMs as robot policies.
