AcL: Action Learner for Fault-Tolerant Quadruped Locomotion Control
Tianyu Xu, Yaoyu Cheng, Pinxi Shen, Lin Zhao
TL;DR
This work tackles fault-tolerant quadruped locomotion under multiple joint faults by introducing Action Learner (AcL), a teacher-student reinforcement learning framework. It trains multiple fault-specific teacher policies and distills their guidance into a single encoder–decoder student policy, using style rewards derived from teacher actions and regularization rewards to ensure robust, smooth gait transitions. The encoder identifies fault conditions from history, enabling autonomous switching between normal and limping gaits, while the decoder generates actions; the approach supports up to four faulty joints and maintains stability under disturbances. Real-world tests on a Unitree Go2 validate fault-tolerant walking, seamless gait transitions, and resilience to external perturbations, demonstrating practical viability and potential applicability to broader terrain-adaptive tasks.
Abstract
Quadrupedal robots can learn versatile locomotion skills but remain vulnerable when one or more joints lose power. In contrast, dogs and cats can adopt limping gaits when injured, demonstrating their remarkable ability to adapt to physical conditions. Inspired by such adaptability, this paper presents Action Learner (AcL), a novel teacher-student reinforcement learning framework that enables quadrupeds to autonomously adapt their gait for stable walking under multiple joint faults. Unlike conventional teacher-student approaches that enforce strict imitation, AcL leverages teacher policies to generate style rewards, guiding the student policy without requiring precise replication. We train multiple teacher policies, each corresponding to a different fault condition, and subsequently distill them into a single student policy with an encoder-decoder architecture. While prior works primarily address single-joint faults, AcL enables quadrupeds to walk with up to four faulty joints across one or two legs, autonomously switching between different limping gaits when faults occur. We validate AcL on a real Go2 quadruped robot under single- and double-joint faults, demonstrating fault-tolerant, stable walking, smooth transitions between normal and limping gaits, and robustness against external disturbances.
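
As an illustration of the style-reward idea described above, the following minimal Python sketch shows one way a teacher's action could shape the student's reward without forcing exact imitation. The exponential kernel, the function names `style_reward` and `total_reward`, and the parameters `scale`, `w_style`, and `w_smooth` are assumptions made for this sketch; the abstract does not specify the reward formulas used in AcL.

```python
import math

def style_reward(student_action, teacher_action, scale=1.0):
    """Hypothetical style reward: equals 1 when the student matches the
    teacher and decays smoothly as the squared action error grows."""
    sq_err = sum((s - t) ** 2 for s, t in zip(student_action, teacher_action))
    return math.exp(-sq_err / scale)

def total_reward(task_reward, student_action, teacher_action, prev_action,
                 w_style=0.5, w_smooth=0.01):
    """Hypothetical combined reward: task term, plus a teacher-derived style
    term, minus a regularization penalty on action changes between steps."""
    smooth_penalty = sum((a - p) ** 2 for a, p in zip(student_action, prev_action))
    style = style_reward(student_action, teacher_action)
    return task_reward + w_style * style - w_smooth * smooth_penalty

# Example: the student deviates slightly from the teacher of the active fault.
teacher = [0.10, -0.20, 0.30]   # action proposed by the fault-specific teacher
student = [0.12, -0.18, 0.25]   # action produced by the student policy
previous = [0.10, -0.20, 0.28]  # student's action at the previous step
reward = total_reward(1.0, student, teacher, previous)
```

Because the style term is a soft bonus rather than a hard imitation loss, the student in this sketch is still rewarded for task progress when it departs from the teacher, which mirrors the abstract's contrast with strict imitation.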
