TRANS: Terrain-aware Reinforcement Learning for Agile Navigation of Quadruped Robots under Social Interactions

Wei Zhu, Irfan Tito Kurniawan, Ye Zhao, Mitsuhiro Hayashibe

TL;DR

This study introduces TRANS: Terrain-aware Reinforcement learning for Agile Navigation under Social interactions, a deep reinforcement learning (DRL) framework for quadrupedal social navigation over unstructured terrains, and proposes a two-stage training framework with three DRL pipelines.

Abstract

This study introduces TRANS: Terrain-aware Reinforcement learning for Agile Navigation under Social interactions, a deep reinforcement learning (DRL) framework for quadrupedal social navigation over unstructured terrains. Conventional quadrupedal navigation typically separates motion planning from locomotion control, neglecting whole-body constraints and terrain awareness. On the other hand, end-to-end methods are more integrated but require high-frequency sensing, which is often noisy and computationally costly. In addition, most existing approaches assume static environments, limiting their use in human-populated settings. To address these limitations, we propose a two-stage training framework with three DRL pipelines. (1) TRANS-Loco employs an asymmetric actor-critic (AC) model for quadrupedal locomotion, enabling traversal of uneven terrains without explicit terrain or contact observations. (2) TRANS-Nav applies a symmetric AC framework for social navigation, directly mapping transformed LiDAR data to ego-agent actions under differential-drive kinematics. (3) A unified pipeline, TRANS, integrates TRANS-Loco and TRANS-Nav, supporting terrain-aware quadrupedal navigation in uneven and socially interactive environments. Comprehensive benchmarks against locomotion and social navigation baselines demonstrate the effectiveness of TRANS. Hardware experiments further confirm its potential for sim-to-real transfer.
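The differential-drive kinematics that constrain the TRANS-Nav actions can be illustrated with the standard unicycle model. The sketch below is only an illustration under stated assumptions: the action is taken to be a linear velocity `v` and an angular velocity `omega`, the pose is integrated with a simple Euler step, and the function name `diff_drive_step` and the time step `dt` are hypothetical. The abstract does not specify the paper's action parameterization, velocity limits, or control frequency.

```python
import math


def diff_drive_step(x, y, theta, v, omega, dt):
    """One Euler step of the unicycle (differential-drive) model.

    Assumed, not taken from the paper: the action is (v, omega) and the
    pose (x, y, theta) is expressed in a fixed world frame.
    """
    # The robot can only translate along its current heading.
    x_next = x + v * math.cos(theta) * dt
    y_next = y + v * math.sin(theta) * dt
    # Integrate the heading and wrap it to [-pi, pi).
    theta_next = (theta + omega * dt + math.pi) % (2.0 * math.pi) - math.pi
    return x_next, y_next, theta_next
```

With zero angular velocity the robot moves in a straight line along its heading; a nonzero `omega` with `v = 0` turns it in place, which is the non-holonomic behaviour a differential-drive policy has to respect.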

Paper Structure (41 sections, 24 equations, 13 figures, 11 tables, 3 algorithms)

Figures (13)

  • Figure 1: Terrain-aware navigation of quadruped robots in socially interactive environments. The top panel illustrates simulation training scenarios, while the bottom panels show real-world implementations.
  • Figure 2: Overall framework with a two-stage training architecture. In the first stage, a quadrupedal locomotion policy and a social navigation policy are trained separately. In the second stage, these two policies are unified into a single quadrupedal navigation policy, which is further retrained in uneven and socially interactive environments.
  • Figure 3: Terrain height map and contact points visualized in IsaacSim.
  • Figure 4: LiDAR scan transformation. The red filled circle represents the ego-agent, blue filled circles denote static obstacles of varying sizes, and blue hollow circles indicate pedestrians with a fixed radius. The left panel shows the LiDAR scan (solid beams) at time $t-k$, while the right panel illustrates the scan (solid beams) at the current time $t$. Dashed lines in the right panel correspond to the transformed LiDAR beams from $t-k$ to $t$. Each point $P_i$ retains the same global position in both panels.
  • Figure 5: Reward structure and scenarios. The left panel illustrates the geometric representation used for defining the reward function, while the right panel depicts four possible ego-agent scenarios: collision, goal arrival, discomfort, and open space.
  • ...and 8 more figures
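
The LiDAR scan transformation described in the Figure 4 caption can be sketched as a change of reference frame: each scan point keeps its global position, and only the ego-agent pose changes between time $t-k$ and time $t$. The code below is a minimal sketch under assumptions that are not stated in the caption: poses are given as `(x, y, heading)` in a common world frame, beam angles are measured relative to the ego heading, and the function name `transform_scan` is hypothetical. It only re-expresses the old points in the current frame and does not re-bin them onto a fixed set of beams, which the paper's method may or may not do.

```python
import math


def transform_scan(ranges, angles, pose_prev, pose_curr):
    """Re-express a scan taken at pose_prev in the frame of pose_curr.

    ranges, angles: beam lengths and beam angles (relative to the ego
    heading) of the scan taken at time t-k.
    pose_prev, pose_curr: assumed (x, y, heading) world-frame poses of
    the ego-agent at times t-k and t.
    Returns the ranges and angles of the same points seen from pose_curr.
    """
    x0, y0, th0 = pose_prev
    x1, y1, th1 = pose_curr
    new_ranges, new_angles = [], []
    for r, a in zip(ranges, angles):
        # Global position of the scan point P_i (unchanged over time).
        gx = x0 + r * math.cos(th0 + a)
        gy = y0 + r * math.sin(th0 + a)
        # Offset of P_i from the current ego position.
        dx, dy = gx - x1, gy - y1
        new_ranges.append(math.hypot(dx, dy))
        # Bearing relative to the current heading, wrapped to [-pi, pi).
        bearing = math.atan2(dy, dx) - th1
        new_angles.append((bearing + math.pi) % (2.0 * math.pi) - math.pi)
    return new_ranges, new_angles
```

For example, a point seen 2 m straight ahead from the origin appears 1 m straight ahead after the ego-agent has moved 1 m forward without turning.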