Socially-Aware Robot Navigation Enhanced by Bidirectional Natural Language Conversations Using Large Language Models

Congcong Wen; Yifan Liu; Geeta Chandra Raju Bethala; Shuaihang Yuan; Hao Huang; Yu Hao; Mengyu Wang; Yu-Shen Liu; Anthony Tzes; Yi Fang

TL;DR

The paper tackles socially-aware robot navigation by enabling bidirectional natural-language interaction between robot and pedestrians during movement. It introduces HSAC-LLM, a Hybrid Soft Actor-Critic policy that outputs both continuous speeds $[v,\omega]$ and discrete interaction actions $a_c$, augmented by Lang_GenNet for natural-language communication via an LLM. The architecture includes PreNet modules to fuse sensor data and conversation cues, and a tailored reward function combining safety, efficiency, and interaction factors. Across 2D simulations, Gazebo, and real-world tests, HSAC-LLM outperforms standard DRL baselines in success rate, collision rate, and path efficiency, demonstrating practical viability for harmonious human-robot collaboration.
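To make the hybrid action space concrete, the following is a minimal sketch of drawing one action that pairs continuous speeds $[v,\omega]$ with a discrete interaction action $a_c$. It is not the authors' implementation: the speed limits, the interaction labels, the tanh-squashed Gaussian head, and the function names are all assumptions chosen for illustration, and the policy outputs (`mean`, `log_std`, `logits`) are passed in as plain lists rather than produced by a trained network.

```python
import math
import random
from dataclasses import dataclass

# Hypothetical limits and labels, not taken from the paper.
V_MAX = 1.0  # assumed maximum linear speed (m/s)
W_MAX = 1.0  # assumed maximum angular speed (rad/s)
INTERACTION_ACTIONS = ("stay_silent", "ask_to_pass", "yield")  # assumed labels


@dataclass
class HybridAction:
    v: float      # continuous linear speed
    omega: float  # continuous angular speed
    a_c: int      # index into INTERACTION_ACTIONS


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hybrid_action(mean, log_std, logits, rng):
    """Sample [v, omega] from a tanh-squashed Gaussian and a_c from a categorical."""
    raw = [rng.gauss(mu, math.exp(ls)) for mu, ls in zip(mean, log_std)]
    squashed = [math.tanh(x) for x in raw]   # each value lies in [-1, 1]
    v = 0.5 * (squashed[0] + 1.0) * V_MAX    # rescale to [0, V_MAX]
    omega = squashed[1] * W_MAX              # rescale to [-W_MAX, W_MAX]
    probs = softmax(logits)
    a_c = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return HybridAction(v, omega, a_c)


if __name__ == "__main__":
    rng = random.Random(0)
    action = sample_hybrid_action([0.0, 0.0], [-1.0, -1.0], [0.1, 0.2, 0.3], rng)
    print(action, INTERACTION_ACTIONS[action.a_c])
```

Sampling the two parts from separate heads keeps the continuous speeds bounded while leaving the interaction choice as an ordinary categorical decision; how HSAC-LLM actually couples the two heads is described in the paper's HSAC Module section, not here.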

Abstract

Robot navigation is crucial across various domains, yet traditional methods focus on efficiency and obstacle avoidance, often overlooking human behavior in shared spaces. With the rise of service robots, socially aware navigation has gained prominence. However, existing approaches primarily predict pedestrian movements or issue alerts, lacking true human-robot interaction. We introduce Hybrid Soft Actor-Critic with Large Language Model (HSAC-LLM), a novel framework for socially aware navigation. By integrating deep reinforcement learning with large language models, HSAC-LLM enables bidirectional natural language interactions, predicting both continuous and discrete navigation actions. When potential collisions arise, the robot proactively communicates with pedestrians to determine avoidance strategies. Experiments in 2D simulation, Gazebo, and real-world environments demonstrate that HSAC-LLM outperforms state-of-the-art DRL methods in interaction, navigation, and obstacle avoidance. This paradigm advances effective human-robot interactions in dynamic settings. Videos are available at https://hsacllm.github.io/.
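Because the framework is trained with deep reinforcement learning, each step needs a scalar reward, which the summary describes as combining safety, efficiency, and interaction factors. The sketch below shows one way such a weighted sum could be composed. Every field name, weight, and magnitude is a hypothetical placeholder; the paper's Reward Function section defines the actual terms.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    collided: bool            # robot hit a pedestrian or obstacle this step
    reached_goal: bool        # robot arrived at its goal this step
    progress: float           # reduction in distance to goal this step (m)
    spoke: bool               # robot issued an utterance this step
    pedestrian_yielded: bool  # pedestrian agreed to give way


# Assumed weights; the real values are not given in this summary.
W_SAFETY = 1.0
W_EFFICIENCY = 1.0
W_INTERACTION = 1.0


def step_reward(info: StepInfo) -> float:
    """Weighted sum of illustrative safety, efficiency, and interaction terms."""
    safety = -10.0 if info.collided else 0.0
    efficiency = (10.0 if info.reached_goal else 0.0) + info.progress
    interaction = 0.0
    if info.spoke:
        # Reward a conversation that resolves the conflict, lightly penalise one that does not.
        interaction = 1.0 if info.pedestrian_yielded else -0.5
    return W_SAFETY * safety + W_EFFICIENCY * efficiency + W_INTERACTION * interaction


if __name__ == "__main__":
    print(step_reward(StepInfo(False, False, 0.2, True, True)))
```

Keeping the three terms separate makes it easy to inspect which factor dominates a given step when tuning the weights.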

Paper Structure (27 sections, 6 equations, 5 figures, 3 tables)

INTRODUCTION
RELATED WORK
Robot Navigation
Human-Robot Interaction in Robot Navigation
Discrete-Continuous Hybrid Action Space
Methods
Overview
PreNet
Img_PreNet
Lang_PreNet
State_PreNet
HSAC-LLM
HSAC Module
Lang_GenNet Module
Reward Function
...and 12 more sections

Figures (5)

Figure 1: Strategies used to avoid an imminent collision with pedestrians. Left: Detouring along a longer route to prevent a collision [yao2021crowd]. Middle: Beeping to alert pedestrians and create space for the robot [9341519]. Right: Voice interaction, enabling natural language conversations between robots and pedestrians (our approach).
Figure 2: Illustration of a typical environment and the architecture of the proposed HSAC-LLM model.
Figure 3: Visualization of navigation trajectories in the Square, Hallway, and Crosswalk scenarios for baseline models (top) and our HSAC-LLM model (bottom).
Figure 4: Interaction Scenarios in Hallway and Crosswalk Environments (Robot Proactive and Pedestrian Proactive cases).
Figure 5: Real-world hallway experiments showing Robot Proactive (top) and Human Proactive (bottom) scenarios, demonstrating how natural-language communication enables effective collision avoidance between robot and pedestrian.
