HiFAR: Multi-Stage Curriculum Learning for High-Dynamics Humanoid Fall Recovery
Penghui Chen, Yushi Wang, Changsheng Luo, Wenhan Cai, Mingguo Zhao
TL;DR
HiFAR introduces a two-stage curriculum that grows the fall-recovery task from planar 2D to full 3D, yielding a high-dynamics policy that recovers from diverse falls. The approach combines reference/state initialization, reward shaping, domain randomization, and cross-stage network expansion to produce a policy that transfers from simulation to the Booster T1 robot. Simulation and real-robot experiments show high success rates, fast recovery times, robustness to disturbances and loads, and generalization across initial states and environmental conditions. The work advances autonomous fall recovery for humanoids in dynamic, unstructured environments and offers a practical path toward real-world deployment.
Abstract
Humanoid robots struggle to recover autonomously from falls, especially in dynamic and unstructured environments. Conventional control methods often cannot handle the high-dimensional dynamics and contact-rich nature of fall recovery, while reinforcement learning is hindered by sparse rewards, intricate collision scenarios, and discrepancies between simulation and the real world. We introduce HiFAR, a multi-stage curriculum learning framework that progressively incorporates increasingly complex and high-dimensional recovery tasks, allowing the robot to acquire efficient and stable fall recovery strategies and to adapt its policy to real-world falls. We evaluate the method on a real humanoid robot and show that it recovers autonomously from a diverse range of falls with high success rates, rapid recovery times, robustness, and generalization.
