Learning Vision-Based Bipedal Locomotion for Challenging Terrain
Helei Duan, Bikram Pandit, Mohitvishnu S. Gadde, Bart van Marum, Jeremy Dao, Chanho Kim, Alan Fern
TL;DR
This work presents a fully learned vision-based bipedal locomotion framework that uses a local heightmap and proprioception to react to challenging terrain without relying on global odometry. It couples a terrain-aware control policy with a depth-based heightmap predictor; both are trained entirely in simulation and transferred to real hardware via extensive domain randomization. On the Cassie robot, this two-component system achieves robust locomotion over blocks, stairs, and treadmill scenarios without pose estimation. The results demonstrate that fully learned perception and control is feasible for agile, vision-guided bipedal locomotion in real-world environments.
Abstract
Reinforcement learning (RL) for bipedal locomotion has recently demonstrated robust gaits over moderate terrains using only proprioceptive sensing. However, such blind controllers will fail in environments where robots must anticipate and adapt to local terrain, which requires visual perception. In this paper, we propose a fully learned system that allows bipedal robots to react to local terrain while maintaining commanded travel speed and direction. Our approach first trains a controller in simulation using a heightmap expressed in the robot's local frame. Next, data is collected in simulation to train a heightmap predictor, whose input is a history of depth images and robot states. We demonstrate that with appropriate domain randomization, this approach allows for successful sim-to-real transfer with no explicit pose estimation and no fine-tuning using real-world data. To the best of our knowledge, this is the first example of sim-to-real learning for vision-based bipedal locomotion over challenging terrains.
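
To make the notion of a heightmap expressed in the robot's local frame concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the simulator exposes a ground-truth terrain height query; the function names, grid extents, resolution, and the block terrain are all hypothetical choices made for illustration.

```python
import math

def local_heightmap(terrain_height, base_xy, base_z, yaw,
                    x_range=(0.0, 1.0), y_range=(-0.5, 0.5), resolution=0.25):
    """Sample terrain heights on a grid fixed to the robot's local frame.

    terrain_height: callable (world_x, world_y) -> height in metres
                    (hypothetical simulator query, an assumption).
    base_xy, base_z, yaw: robot base pose in the world frame.
    Returns a list of rows; each entry is terrain height relative to the base.
    """
    bx, by = base_xy
    c, s = math.cos(yaw), math.sin(yaw)
    nx = int(round((x_range[1] - x_range[0]) / resolution)) + 1
    ny = int(round((y_range[1] - y_range[0]) / resolution)) + 1
    grid = []
    for i in range(nx):
        lx = x_range[0] + i * resolution      # forward offset in the local frame
        row = []
        for j in range(ny):
            ly = y_range[0] + j * resolution  # lateral offset in the local frame
            # Rotate the local grid point by yaw and translate to the world frame.
            wx = bx + c * lx - s * ly
            wy = by + s * lx + c * ly
            # Express height relative to the base, so no global position is needed.
            row.append(terrain_height(wx, wy) - base_z)
        grid.append(row)
    return grid

def step_terrain(x, y):
    """Hypothetical terrain: a 0.2 m high block starting at world x = 0.5 m."""
    return 0.2 if x >= 0.5 else 0.0

# Robot at the origin facing +x: rows at or beyond 0.5 m ahead see the block.
heightmap = local_heightmap(step_terrain, (0.0, 0.0), 0.0, 0.0)
```

Because every grid point is defined relative to the robot's own position and heading, the same map results wherever the robot stands in the world, which is consistent with the stated goal of operating without explicit pose estimation. In the system described above, the learned predictor would have to estimate such a map from depth images and robot states rather than from a ground-truth query.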
