Discovering Self-Protective Falling Policy for Humanoid Robot via Deep Reinforcement Learning

Diyuan Shi, Shangke Lyu, Donglin Wang

TL;DR

The paper aims to reduce fall damage in humanoid robots by letting a deep reinforcement learning agent discover protective behaviors on its own through curriculum learning and domain randomization. Training in thousands of parallel simulation environments (IsaacGym) yields an emergent triangle-brace policy that distributes impact and lowers loads on critical body parts, and the policy transfers successfully to a real Unitree G1. An experimental framework with diverse fall scenarios and metrics shows that the approach outperforms baselines in reducing motion energy and contact forces, and validates its real-world applicability. The work demonstrates that robot-specific protective strategies can be learned without heavy human priors, enabling safer operation of high-DoF humanoids in dynamic settings.

Abstract

Humanoid robots have attracted significant research interest and seen rapid advances in recent years. Despite many successes, their morphology, dynamics, and the limitations of their control policies make humanoid robots more prone to falling than other embodiments such as quadruped or wheeled robots. Their large weight, high Center of Mass, and high Degree-of-Freedom mean that an uncontrolled fall can cause serious hardware damage, both to the robot itself and to surrounding objects. Existing research in this field mostly focuses on control-based methods, which struggle to handle diverse falling scenarios and may introduce unsuitable human priors. In contrast, large-scale Deep Reinforcement Learning and Curriculum Learning can be employed to incentivize a humanoid agent to discover a falling protection policy that fits its own nature and properties. In this work, with carefully designed reward functions and a domain diversification curriculum, we successfully train a humanoid agent to explore falling protection behaviors and discover that, by forming a 'triangle' structure, falling damage can be significantly reduced for its rigid-material body. With comprehensive metrics and experiments, we quantify its performance in comparison with other methods, visualize its falling behaviors, and successfully transfer it to a real-world platform.

Paper Structure

This paper contains 16 sections, 2 equations, 11 figures, 2 tables.

Figures (11)

  • Figure 1: An illustrative diagram of falling protection in a human and a humanoid robot. The leftmost panel depicts a human's behavior, and the two panels on the right show our humanoid robot falling backward and forward, respectively. The red curves emphasize the 'triangle' structure learned by our robot.
  • Figure 2: An illustrative diagram of our learning framework.
  • Figure 3: Illustration of some behaviors unlikely to transfer to real-world deployment.
  • Figure 4: Illustration of our testing environments. The white arrow represents the pushing direction. The colored body represents the pushed body or the failed actuation.
  • Figure 5: The aggregated falling damages under various falling scenarios.
  • ...and 6 more figures