PALo: Learning Posture-Aware Locomotion for Quadruped Robots
Xiangyu Miao, Jun Sun, Hang Lai, Xinpeng Di, Jiahang Cao, Yong Yu, Weinan Zhang
TL;DR
PALo tackles posture-aware locomotion for quadruped robots by learning an end-to-end policy that jointly tracks 6D velocity and posture commands. It formulates control as a partially observable MDP and trains an asymmetric actor-critic, augmented by Adversarial Motion Priors (AMP), a layered training curriculum, and domain randomization to bridge the sim-to-real gap. Key contributions include integrating posture control into 6D command tracking, using AMP to simplify reward design, and demonstrating sim-to-real transfer on a real robot without fine-tuning, supported by comprehensive ablations. The results show robust performance across diverse terrains and highlight the importance of AMP, curricula, and encoder design for stable, real-time posture-aware locomotion, positioning PALo as a foundation for higher-level embodied intelligence modules.
Abstract
With the rapid development of embodied intelligence, locomotion control of quadruped robots on complex terrains has become a research hotspot. Unlike traditional locomotion control approaches that focus solely on velocity tracking, we aim to balance the agility and robustness of quadruped robots on diverse and complex terrains. To this end, we propose PALo, an end-to-end deep reinforcement learning framework for posture-aware locomotion that simultaneously tracks linear and angular velocity and adjusts body height, pitch, and roll angles in real time. In PALo, the locomotion control problem is formulated as a partially observable Markov decision process, and an asymmetric actor-critic architecture is adopted to overcome the sim-to-real challenge. Further, by incorporating customized training curricula, PALo achieves agile posture-aware locomotion control in simulated environments and transfers to real-world settings without fine-tuning, allowing real-time control of the quadruped robot's locomotion and body posture across challenging terrains. Through in-depth experimental analysis, we identify the key components of PALo that contribute to its performance, further validating the effectiveness of the proposed method. The results of this study open new possibilities for low-level locomotion control of quadruped robots in higher-dimensional command spaces and lay the foundation for future research on upper-level modules for embodied intelligence.
