Learning Locomotion on Complex Terrain for Quadrupedal Robots with Foot Position Maps and Stability Rewards

Matthew Hwang, Yubin Liu, Ryo Hakoda, Takeshi Oishi

Abstract

Quadrupedal locomotion over complex terrain has been a long-standing research topic in robotics. While recent reinforcement learning-based locomotion methods improve generalizability and foot-placement precision, they rely on implicit inference of foot positions from joint angles, lacking the explicit precision and stability guarantees of optimization-based approaches. To address this, we introduce a foot position map integrated into the heightmap, and a dynamic locomotion-stability reward within an attention-based framework to achieve locomotion on complex terrain. We validate our method extensively on terrains seen during training as well as out-of-domain (OOD) terrains. Our results demonstrate that the proposed method enables precise and stable movement, resulting in improved locomotion success rates on both in-domain and OOD terrains.

Paper Structure

This paper contains 33 sections, 11 equations, 10 figures, 8 tables.

Figures (10)

  • Figure A1: Overview of our method. With the proposed foot position map and stability reward, our policy achieves successful locomotion over complex terrain.
  • Figure A2: We propose a foot position map, concatenated with the heightmap, that encodes the foot positions relative to the terrain. During training, we add a stability reward based on the CoP to guide the policy toward safer actions.
  • Figure A3: CoP-based stability rewards. We use the minimum distance from the Center of Pressure (CoP) to the boundary of the support polygon as the stability reward.
  • Figure D1: The training environment consists of smooth, rough, stairs up, stairs down, discrete, and stones terrains. The OOD evaluation terrains include novel combinations of stones (S), rough (R), stairs up (SU), and stairs down (SD) terrains, as well as beams, pallets, circles, small stones, pits, and gaps.
  • Figure D2: The success rate results. (a) Our proposed method produces higher success rates across all terrains. (b) We observe an improvement at higher difficulty levels. (c) We observe a higher success rate, especially at lower velocities.
  • ...and 5 more figures
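The stability reward described in Figure A3 can be sketched concretely: it is the minimum distance from the Center of Pressure to the boundary of the support polygon formed by the feet in contact. The paper does not provide pseudocode, so the following is a minimal illustrative sketch, assuming a 2D projection of the contacts, a convex support polygon with counter-clockwise vertex ordering, and a sign convention (positive margin inside, negative outside) that is our own choice, not necessarily the authors'.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (all 2D tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # project p onto the segment and clamp to [0, 1]
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def stability_reward(cop, support_polygon):
    """Signed minimum distance from the CoP to the support-polygon boundary.

    cop: (x, y) Center of Pressure projected onto the ground plane.
    support_polygon: list of (x, y) foot-contact points, assumed convex
    and ordered counter-clockwise.
    Returns the margin (positive if the CoP is inside the polygon,
    negative if outside), so the policy is rewarded for keeping the
    CoP well within the support region.
    """
    edges = list(zip(support_polygon, support_polygon[1:] + support_polygon[:1]))
    # for a convex CCW polygon, the CoP is inside iff it lies on the
    # left of (or on) every directed edge
    inside = all(
        (b[0] - a[0]) * (cop[1] - a[1]) - (b[1] - a[1]) * (cop[0] - a[0]) >= 0
        for a, b in edges
    )
    margin = min(point_segment_distance(cop, a, b) for a, b in edges)
    return margin if inside else -margin
```

For example, with four contacts at the corners of a unit square, a CoP at the center yields a margin of 0.5, while a CoP one unit outside an edge yields -1.0; the reward term thus penalizes configurations where the CoP drifts toward or past the support boundary.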