FRASA: An End-to-End Reinforcement Learning Agent for Fall Recovery and Stand Up of Humanoid Robots
Clément Gaspard, Marc Duclusaud, Grégoire Passault, Mélodie Daniel, Olivier Ly
TL;DR
FRASA is a unified deep reinforcement learning agent that handles both fall recovery and stand-up in humanoid robots, addressing the limitations of MPC and traditional key-frame-based (KFB) approaches. The agent is trained end to end with CrossQ in a symmetry-exploiting, sim-to-real framework with extensive domain randomization, achieving rapid training and robust performance on the Sigmaban robot. Empirical results show FRASA outperforming the KFB baseline used by the Rhoban team at RoboCup in both stand-up speed and disturbance rejection, while maintaining safe, adaptable behaviors. The result is a fast, transferable recovery-and-stand-up policy that reduces reliance on expert tuning and makes humanoid locomotion more resilient.
Abstract
Humanoid robotics faces significant challenges in achieving stable locomotion and recovering from falls in dynamic environments. Traditional methods, such as Model Predictive Control (MPC) and Key Frame Based (KFB) routines, either require extensive fine-tuning or lack real-time adaptability. This paper introduces FRASA, a Deep Reinforcement Learning (DRL) agent that integrates fall recovery and stand-up strategies into a unified framework. Leveraging the CrossQ algorithm, FRASA significantly reduces training time and offers a versatile recovery strategy that adapts to unpredictable disturbances. Comparative tests on Sigmaban humanoid robots demonstrate FRASA's superior performance against the KFB method deployed at RoboCup 2023 by the Rhoban Team, world champion of the KidSize League.
