SafeFall: Learning Protective Control for Humanoid Robots
Ziyu Meng, Tengyu Liu, Le Ma, Yingying Wu, Ran Song, Wei Zhang, Siyuan Huang
TL;DR
SafeFall addresses the problem of protecting full-scale humanoid robots from fall-related damage by combining a lightweight GRU fall predictor with a damage-aware reinforcement learning policy that activates only when a fall is unavoidable. The method collects diverse failure trajectories, trains a predictor with conservative labeling, and learns a protection strategy via PPO using a multi-term reward that encodes component vulnerability and actuator limits. Key innovations include a two-stage curriculum, an asymmetric actor-critic framework, and domain randomization for sim-to-real transfer. On a Unitree G1, SafeFall reduces peak contact forces by 68.3% and peak joint torques by 78.4%, and eliminates 99.3% of collisions with vulnerable components. Real-world experiments corroborate simulation results, showing high specificity, modest lead times, and substantial reductions in impulse and damage during omnidirectional falls. Overall, SafeFall provides a practical safety net that preserves nominal performance while enabling more aggressive experimentation and faster deployment of humanoid robots in unstructured environments.
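To make the idea of a reward that "encodes component vulnerability and actuator limits" concrete, the following is a minimal sketch and not the paper's actual reward. The body-part names, the weights in `VULNERABILITY`, the `StepInfo` container, and the coefficients `w_contact` and `w_torque` are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical per-body-part vulnerability weights (illustrative, not from the paper).
VULNERABILITY = {"head": 10.0, "hand": 5.0, "torso": 1.0, "hip": 0.5}


@dataclass
class StepInfo:
    contact_forces: dict  # body part -> contact force magnitude (N)
    joint_torques: list   # per-joint torque (N*m)
    torque_limits: list   # per-joint actuator limit (N*m)


def damage_aware_reward(info, w_contact=1e-3, w_torque=1.0):
    """Return a non-positive reward; larger damage proxies give lower reward."""
    # Contact term: the same force costs more on a fragile part than on a robust one.
    contact_pen = sum(
        VULNERABILITY.get(part, 1.0) * force
        for part, force in info.contact_forces.items()
    )
    # Actuator term: penalize only the fraction by which a torque exceeds its limit.
    torque_pen = sum(
        max(0.0, abs(tau) / limit - 1.0)
        for tau, limit in zip(info.joint_torques, info.torque_limits)
    )
    return -(w_contact * contact_pen + w_torque * torque_pen)
```

Under these assumed weights, a 200 N impact on the head is penalized twenty times more than the same impact on the hip, which nudges a learned policy toward landing on sturdier parts.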
Abstract
Bipedal locomotion makes humanoid robots inherently prone to falls, which can cause catastrophic damage to the expensive sensors, actuators, and structural components of full-scale robots. To address this critical barrier to real-world deployment, we present SafeFall, a framework that learns to predict imminent, unavoidable falls and execute protective maneuvers to minimize hardware damage. SafeFall is designed to operate seamlessly alongside existing nominal controllers, ensuring no interference during normal operation. It combines two synergistic components: a lightweight, GRU-based fall predictor that continuously monitors the robot's state, and a reinforcement learning policy for damage mitigation. The protective policy remains dormant until the predictor identifies a fall as unavoidable, at which point it takes control and executes a damage-minimizing response. This policy is trained with a novel, damage-aware reward function that incorporates the robot's specific structural vulnerabilities, learning to shield critical components like the head and hands while absorbing energy with more robust parts of its body. Validated on a full-scale Unitree G1 humanoid, SafeFall demonstrated significant improvements over unprotected falls: it reduced peak contact forces by 68.3% and peak joint torques by 78.4%, and eliminated 99.3% of collisions with vulnerable components. By enabling humanoids to fail safely, SafeFall provides a crucial safety net that allows for more aggressive experiments and accelerates the deployment of these robots in complex, real-world environments.
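The handover between the nominal controller and the protective policy can be sketched as a small latching supervisor. This is an illustrative assumption about how such gating might be wired, not the authors' implementation: `FallSupervisor`, `select_action`, the `threshold` of 0.9, and the `consecutive` count of 3 are hypothetical, and `fall_probability` stands in for the output of the GRU predictor.

```python
class FallSupervisor:
    """Latch control to the protective policy once a fall is judged unavoidable."""

    def __init__(self, threshold=0.9, consecutive=3):
        self.threshold = threshold      # assumed probability cutoff
        self.consecutive = consecutive  # assumed number of steps required above the cutoff
        self._streak = 0
        self.triggered = False

    def update(self, fall_probability):
        """Feed one predictor output; return True once protection is active."""
        if self.triggered:
            return True  # latched: never hand control back mid-fall
        if fall_probability >= self.threshold:
            self._streak += 1
        else:
            self._streak = 0  # a single low reading resets the count
        if self._streak >= self.consecutive:
            self.triggered = True
        return self.triggered


def select_action(supervisor, fall_probability, nominal_action, protective_action):
    """Pass the nominal action through untouched until the supervisor triggers."""
    if supervisor.update(fall_probability):
        return protective_action
    return nominal_action
```

Requiring several consecutive high readings is one simple way to favor specificity, so that an isolated spike does not interrupt the nominal controller. The cost is a shorter lead time before impact.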
