Blind Bipedal Stair Traversal via Sim-to-Real Reinforcement Learning

Jonah Siekmann, Kevin Green, John Warila, Alan Fern, Jonathan Hurst

TL;DR

This work shows that a blind bipedal robot (Cassie) can robustly traverse stair-like terrain using sim-to-real reinforcement learning with proprioceptive feedback alone. By adding stair-like terrain randomization to an otherwise flat-ground RL framework, the authors train memory-enabled (LSTM) policies that handle unknown stairs without exteroceptive sensing. In simulation, stair-trained LSTM policies climb and descend stairs more reliably than feedforward or flat-ground-trained baselines, and the learned policies transfer to real-world stairs. Analysis of swing-foot motion and ground reaction forces helps explain the robust disturbance rejection: stair-trained policies use higher foot clearance, a steeper foot descent, and faster leg retraction. The study also reports an energy-efficiency trade-off and suggests that integrating vision could improve efficiency in future work.

Abstract

Accurate and precise terrain estimation is a difficult problem for robot locomotion in real-world environments. Thus, it is useful to have systems that do not depend on accurate estimation to the point of fragility. In this paper, we explore the limits of such an approach by investigating the problem of traversing stair-like terrain without any external perception or terrain models on a bipedal robot. For such blind bipedal platforms, the problem appears difficult (even for humans) due to the surprise elevation changes. Our main contribution is to show that sim-to-real reinforcement learning (RL) can achieve robust locomotion over stair-like terrain on the bipedal robot Cassie using only proprioceptive feedback. Importantly, this only requires modifying an existing flat-terrain training RL framework to include stair-like terrain randomization, without any changes in reward function. To our knowledge, this is the first controller for a bipedal, human-scale robot capable of reliably traversing a variety of real-world stairs and other stair-like disturbances using only proprioception.
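
The stair-like terrain randomization mentioned in the abstract can be pictured with a small sketch. The field names follow the parameters listed in the Figure 2 caption (number of stairs, stair height, stair length, landing length, and ground slope before and after the stairs). Everything else here is an assumption made for illustration: the `StairParams` class, the `sample_stair_params` and `tread_heights` helpers, and all numeric ranges are hypothetical and are not taken from the paper or its training code.

```python
import random
from dataclasses import dataclass


@dataclass
class StairParams:
    """One randomized stair configuration (hypothetical container)."""
    num_steps: int            # number of stairs in the flight
    step_height_m: float      # rise of each stair
    step_length_m: float      # run (tread depth) of each stair
    landing_length_m: float   # length of the landing atop the stairs
    slope_before_rad: float   # ground slope immediately before the stairs
    slope_after_rad: float    # ground slope immediately after the stairs


def sample_stair_params(rng: random.Random) -> StairParams:
    """Draw one stair configuration, e.g. at the start of an episode.

    All ranges below are illustrative placeholders, not the paper's values.
    """
    return StairParams(
        num_steps=rng.randint(1, 8),
        step_height_m=rng.uniform(0.10, 0.20),
        step_length_m=rng.uniform(0.25, 0.40),
        landing_length_m=rng.uniform(0.5, 2.0),
        slope_before_rad=rng.uniform(-0.05, 0.05),
        slope_after_rad=rng.uniform(-0.05, 0.05),
    )


def tread_heights(p: StairParams) -> list[float]:
    """Height of each tread above the base of the flight, bottom to top."""
    return [p.step_height_m * (i + 1) for i in range(p.num_steps)]


# Example: sample one configuration with a fixed seed for repeatability.
rng = random.Random(0)
params = sample_stair_params(rng)
print(params)
print(tread_heights(params))
```

Resampling these parameters every episode is what exposes the policy to many stair geometries. In line with the abstract, a sketch like this would change only the terrain and leave the reward function untouched.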

Paper Structure

This paper contains 18 sections, 6 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: In this work, we investigate the limits of blind bipedal locomotion. We present a training pipeline which produces policies capable of blindly ascending and descending stairs in the real world. These policies learn proprioceptive reflexes to reject significant disturbances in ground height, resulting in highly robust behavior to many real-world environments.
  • Figure 2: In order to ensure robustness over a variety of possible stair-like terrain, we randomize a number of parameters when generating stairs at the start of each episode in simulation. These parameters include the number of stairs, the height of each stair, the length of each stair, the length of the landing atop the stairs, and the slope of the ground immediately before and after the stairs.
  • Figure 3: The learned policies exhibit a high degree of blind robustness to a variety of stair-like terrain, and can reliably ascend and descend stairs of typical dimensions found in human environments.
  • Figure 4: We evaluate the probability of successfully climbing and descending stairs without falling as a function of commanded speed between 0.25 m/s and 1.5 m/s over 150 trials. For Stair LSTM policies, there seems to be an optimal approach speed for climbing stairs and a separate optimal descent speed. Stair FF policies do not attain high performance, implying that memory could be an important component of the learned behavior. Flat Ground LSTM policies, having never encountered stairs in training, are virtually unable to climb stairs while finding some success in descending stairs without falling over.
  • Figure 5: A comparison of the swing foot motion of the Stair LSTM policy and the Flat Ground LSTM policy while locomoting at 1.0 m/s. There is a significant change in the leg swing policy as a result of training on randomized stairs. The most significant changes are higher foot clearance, a steeper foot descent and a faster leg angle retraction rate.
  • ...and 2 more figures
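
The Figure 4 caption describes estimating the probability of traversing stairs without falling from repeated trials at each commanded speed. A minimal sketch of that kind of estimate follows. The Wilson score interval is a choice made here to show the uncertainty of a finite trial count; the source does not say how, or whether, the authors computed intervals. The function names are hypothetical, and the counts in the usage lines are made up for illustration and are not results from the paper.

```python
import math


def success_rate(successes: int, trials: int) -> float:
    """Fraction of trials completed without falling."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return successes / trials


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    p = success_rate(successes, trials)
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials + z * z / (4.0 * trials * trials)) / denom
    return center - half, center + half


# Made-up counts for one commanded speed (not data from the paper).
successes, trials = 120, 150
low, high = wilson_interval(successes, trials)
print(f"success rate {success_rate(successes, trials):.3f}, "
      f"interval [{low:.3f}, {high:.3f}]")
```

Repeating this for each commanded speed between 0.25 m/s and 1.5 m/s would give a curve of the kind the caption describes, with an interval showing how much a finite number of trials can resolve.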