Robot Crash Course: Learning Soft and Stylized Falling

Pascal Strauch, David Müller, Sammy Christen, Agon Serifi, Ruben Grandia, Espen Knoop, Moritz Bächer

TL;DR

To address fall risk in legged robots, this work enables controlled, soft falls with end-pose control by optimizing a robot-agnostic reward that balances end-pose accuracy $g$ and impact minimization $r_t$. It introduces a sampling-based end-pose generation strategy to cover diverse initial and final states and trains a policy via PPO with an asymmetric actor–critic setup in Isaac Sim for sim-to-real transfer. The method achieves softer impacts than standard falling strategies and demonstrates successful real-world falls to artist-specified end poses, including robustness to perturbations. These findings open avenues for safe fall handling, artistic stylization, and recovery-ready poses in legged robotics.

Abstract

Despite recent advances in robust locomotion, bipedal robots operating in the real world remain at risk of falling. While most research focuses on preventing such events, we instead concentrate on the phenomenon of falling itself. Specifically, we aim to reduce physical damage to the robot while giving users control over the robot's end pose. To this end, we propose a robot-agnostic reward function that, during reinforcement learning, balances reaching a desired end pose against minimizing impact and protecting critical robot parts. To make the policy robust to a broad range of initial falling conditions and to allow an arbitrary, unseen end pose to be specified at inference time, we introduce a simulation-based sampling strategy for initial and end poses. Through simulated and real-world experiments, our work demonstrates that even bipedal robots can perform controlled, soft falls.
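
To make the trade-off described in the abstract concrete, the following is a minimal sketch of a per-step reward that combines an end-pose term with a sensitivity-weighted impact penalty. It is not the paper's reward: the function name `falling_reward`, the exponential pose kernel, the linear force penalty, and the parameters `w_pose`, `w_impact`, and `sigma` are all assumptions chosen for illustration only.

```python
import math


def falling_reward(joint_pos, target_pos, contact_forces, sensitivities,
                   w_pose=1.0, w_impact=0.01, sigma=0.5):
    """Illustrative reward; every term and default here is an assumption.

    joint_pos, target_pos: joint angles (rad) of the current and desired end pose.
    contact_forces: contact force magnitude (N) per body part.
    sensitivities: hypothetical per-part weights; larger means more fragile.
    """
    # Pose term: approaches 1 as the robot nears the desired end pose.
    sq_err = sum((q - q_t) ** 2 for q, q_t in zip(joint_pos, target_pos))
    pose_term = math.exp(-sq_err / sigma)

    # Impact term: forces on sensitive parts are penalized more heavily.
    impact_term = sum(s * f for s, f in zip(sensitivities, contact_forces))

    return w_pose * pose_term - w_impact * impact_term


# Same pose error, but the impact lands on a fragile part in the second case.
on_robust_part = falling_reward([0.1, 0.2], [0.0, 0.0], [100.0, 0.0], [1.0, 5.0])
on_fragile_part = falling_reward([0.1, 0.2], [0.0, 0.0], [0.0, 100.0], [1.0, 5.0])
```

Under these assumptions, raising `w_impact` pushes the policy toward softer contacts at the cost of end-pose accuracy, which is the kind of trade-off the impact-weight ablation in Figure 5 examines.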

Paper Structure

This paper contains 25 sections, 4 equations, 7 figures, 6 tables.

Figures (7)

  • Figure 1: We propose a reinforcement learning technique that balances user-guided stylized pose objectives and damage-minimizing soft falling objectives for bipedal and other legged robots.
  • Figure 2: Method Overview. We leverage reinforcement learning to train a robust falling policy (right). Our method learns to balance impact minimization with reaching a desired end pose through our reward formulation, which considers user-specified robot part sensitivities. During inference (left), the policy is guided by a user-specified end pose, while simultaneously minimizing impact.
  • Figure 3: Artist-Designed End Poses. Visual examples of the 10 artist-designed end poses used in our experiments.
  • Figure 4: Impact Analysis. Comparison of maximal (left) and mean (right) impact forces across body parts between standard falling strategies and our method.
  • Figure 5: Impact vs. Tracking Ablation. We measure the max impact force and mean joint tracking error for varying impact reward weights. Displayed are the mean values over all trials.
  • ...and 2 more figures