EmoBipedNav: Emotion-aware Social Navigation for Bipedal Robots with Deep Reinforcement Learning

Wei Zhu, Abirath Raju, Abdulaziz Shamsah, Anqi Wu, Seth Hutchinson, Ye Zhao

TL;DR

EmoBipedNav is an emotion-aware, end-to-end deep reinforcement learning (DRL) framework for social navigation with bipedal robots. It represents the environment with sequential LiDAR grid maps (LGMs) that encode collision regions and emotion-driven discomfort zones, and it trains directly with full-body robot dynamics, which narrows the gap between the reduced-order model (ROM) planner and the locomotion controller. In simulations and hardware demos, EmoBipedNav with time-varying emotions achieves higher success rates than model-based and DRL-based baselines and handles pedestrian emotions effectively, and LGMs outperform traditional occupancy grids as the input representation. The work targets practical, emotion-sensitive social navigation for real-world bipedal robots and demonstrates sim-to-real transfer.

Abstract

This study presents an emotion-aware navigation framework -- EmoBipedNav -- using deep reinforcement learning (DRL) for bipedal robots walking in socially interactive environments. The inherent locomotion constraints of bipedal robots challenge their safe maneuvering capabilities in dynamic environments. When combined with the intricacies of social environments, including pedestrian interactions and social cues such as emotions, these challenges become even more pronounced. To address these coupled problems, we propose a two-stage pipeline that considers both bipedal locomotion constraints and complex social environments. Specifically, social navigation scenarios are represented using sequential LiDAR grid maps (LGMs), from which we extract latent features, including collision regions, emotion-related discomfort zones, social interactions, and the spatio-temporal dynamics of evolving environments. The extracted features are directly mapped to the actions of reduced-order models (ROMs) through a DRL architecture. Furthermore, the proposed framework incorporates full-order dynamics and locomotion constraints during training, effectively accounting for tracking errors and restrictions of the locomotion controller while planning the trajectory with ROMs. Comprehensive experiments demonstrate that our approach outperforms both model-based planners and DRL-based baselines. The hardware videos and open-source code are available at https://gatech-lidar.github.io/emobipednav.github.io/.

Paper Structure

This paper contains 17 sections, 9 equations, 10 figures, 3 tables, 1 algorithm.

Figures (10)

  • Figure 1: Emotion-aware social navigation with the bipedal robot Digit. Digit is required to maintain customized comfort distances from pedestrians with specific emotions while navigating toward a designated goal location. Note that the comfort distances vary with pedestrian emotion.
  • Figure 2: Overview of the proposed EmoBipedNav framework using the bipedal robot Digit. Our framework begins by obtaining estimated facial emotions (Fig. (e)) using pre-trained CNN models. Simultaneously, we transform raw LiDAR scans (Fig. (f)) into sequential pie-shaped LGMs (Fig. (a)), where the red cells denote collision areas and the blue patches highlight discomfort zones associated with pedestrian emotions. These grid maps are converted into stacked pixel images, which are further processed by a CNN-based encoder to extract socially interactive and emotionally aware features. The resulting latent features are concatenated with the robot's last command and target position (Fig. (b)) and fed into an actor-critic DRL structure implemented with multi-layer perceptrons (MLPs). The action output from the actor network is derived from the ROM (Fig. (c)) and applied in practice to a bipedal robot with full-body dynamics and constraints (Fig. (d)). The torso position and yaw angle obtained from Digit correspond to the ego-agent state in Fig. (f). We use the Angular momentum LIP planner (ALIP) [gong2021one] and a passivity full-body controller with ankle actuation [ShamsahIntegrated] to track the desired ROM trajectory.
  • Figure 3: LiDAR scan and its corresponding grid maps. (a) illustrates the raw LiDAR scan. The red solid circle with an arrow represents the ego-agent and its orientation in the world frame, and the green solid circle marks the goal. Moving pedestrians are blue hollow circles, and static obstacles are depicted as solid cyan circles. The outer dashed circles illustrate the discomfort margins. (b) and (c) are the corresponding LiDAR grid map and occupancy grid map, respectively. The X and Y axes in (b) represent rotational and radial indices, while X and Y in (c) are horizontal and vertical pixel positions, respectively. (b-1) and (b-2) in (b) are the margins corresponding to two close objects surrounding the ego-agent illustrated in (a). (c-1) and (c-2) correspond to the same objects as (b-1) and (b-2), respectively. Their zoomed-in views are shown in the top row.
  • Figure 4: Geometry representations of reward-related notations.
  • Figure 5: Tracking performance. The top plot shows the $x$ tracking, the middle plot shows the $y$ tracking, and the bottom plot shows the heading-angle tracking in the world frame.
  • ...and 5 more figures
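
The LiDAR grid map described in the Figure 3 caption, with a rotational index on one axis and a radial index on the other, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `scan_to_lgm` and `stack_lgms`, the cell labels, the default range and bin count, and the choice to mark only the cell containing each beam's hit as a collision cell are all assumptions made for illustration. Each beam carries its own discomfort margin, standing in for an emotion-dependent comfort distance.

```python
import math

# Hypothetical cell labels (not taken from the paper).
FREE, DISCOMFORT, COLLISION = 0, 1, 2


def scan_to_lgm(ranges, margins, max_range=6.0, n_radial=12):
    """Convert one LiDAR scan into a polar grid.

    Rows are radial bins (near to far), columns are beams (rotational index).
    ranges[i]  : measured distance of beam i in metres (inf if no return)
    margins[i] : assumed discomfort margin in front of the hit on beam i
    """
    n_beams = len(ranges)
    cell = max_range / n_radial  # radial extent of one grid cell
    grid = [[FREE] * n_beams for _ in range(n_radial)]
    for beam, (r, margin) in enumerate(zip(ranges, margins)):
        if not math.isfinite(r) or r >= max_range:
            continue  # no return inside the map: the whole beam stays free
        hit = min(int(r / cell), n_radial - 1)
        first = int(max(r - margin, 0.0) / cell)
        # Cells between (hit distance - margin) and the hit are discomfort cells.
        for k in range(first, hit):
            grid[k][beam] = DISCOMFORT
        # The cell containing the hit itself is a collision cell.
        grid[hit][beam] = COLLISION
    return grid


def stack_lgms(scans, margins, **kwargs):
    """Build a sequence of grids, one per scan, oldest first."""
    return [scan_to_lgm(ranges, margins, **kwargs) for ranges in scans]


if __name__ == "__main__":
    scan = [2.0, float("inf"), 5.9]
    lgm = scan_to_lgm(scan, margins=[1.0, 0.0, 0.0])
    for row in reversed(lgm):  # print the farthest radial bin first
        print(row)
```

A sequence produced by `stack_lgms` is the kind of stacked, image-like input that a CNN encoder could consume. Because the columns follow beam order, the grid keeps the sensor's native angular resolution, whereas a Cartesian occupancy grid resamples the scan onto square pixels.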