HiLMa-Res: A General Hierarchical Framework via Residual RL for Combining Quadrupedal Locomotion and Manipulation
Xiaoyu Huang, Qiayuan Liao, Yiming Ni, Zhongyu Li, Laura Smith, Sergey Levine, Xue Bin Peng, Koushil Sreenath
TL;DR
HiLMa-Res is a general, hierarchical reinforcement learning framework for quadrupedal loco-manipulation that decouples locomotion control from manipulation planning. At the low level, a task-independent operational-space locomotion controller tracks end-effector (toe) trajectories composed of a nominal CPG-based motion plus residual Bézier corrections. At the high level, a task-specific planner outputs those residual trajectories together with base commands to accomplish each task. The framework is demonstrated on ball dribbling, stepping over obstacles, and load navigation, in simulation and on real hardware, including fine-tuning on real-world data with data-efficient methods, and it compares favorably with baselines. Because only the planner is task-specific, the framework can be adapted quickly to new loco-manipulation tasks and supports different observation modalities, including vision, which makes it practical for real-world deployment on quadrupeds.
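To make the "nominal motion plus residual Bézier corrections" idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single toe, a swing phase normalized to [0, 1], and a cubic residual whose first and last control points are zero so the correction vanishes at lift-off and touchdown. The function names (`nominal_swing`, `bezier`, `toe_target`), the step dimensions, and the sinusoidal nominal profile are all hypothetical stand-ins; in particular, `nominal_swing` is a simple placeholder for the CPG-based nominal motion.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def nominal_swing(phase: float, step_length: float = 0.10,
                  step_height: float = 0.05) -> Vec3:
    """Hypothetical nominal toe position (x, y, z) in metres for phase in [0, 1].

    Placeholder for a CPG-generated swing: x sweeps forward linearly and
    z follows a half sine so the toe starts and ends on the ground.
    """
    x = step_length * (phase - 0.5)
    z = step_height * math.sin(math.pi * phase)
    return (x, 0.0, z)


def bezier(control_points: Sequence[Vec3], t: float) -> Vec3:
    """Evaluate a Bezier curve of any degree with de Casteljau's algorithm."""
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        # Linearly interpolate between each pair of neighbouring points.
        pts = [
            tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts, pts[1:])
        ]
    return pts[0]


def toe_target(phase: float, residual_ctrl: Sequence[Vec3]) -> Vec3:
    """Toe target = nominal trajectory + Bezier residual at the same phase."""
    nominal = nominal_swing(phase)
    residual = bezier(residual_ctrl, phase)
    return tuple(n + r for n, r in zip(nominal, residual))


# Assumed planner output: raise the swing apex without moving the endpoints.
residual_ctrl = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.04),
                 (0.0, 0.0, 0.04), (0.0, 0.0, 0.0)]

for phase in (0.0, 0.25, 0.5, 0.75, 1.0):
    x, y, z = toe_target(phase, residual_ctrl)
    print(f"phase={phase:.2f}  x={x:+.3f}  y={y:+.3f}  z={z:+.3f}")
```

In this sketch the high-level planner would only need to emit the control points in `residual_ctrl` (plus a base command, omitted here), while the low-level controller is responsible for tracking the resulting target. Pinning the end control points to zero is a design choice of the example, made so that the correction changes the shape of the swing without shifting where it starts or ends.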
Abstract
This work presents HiLMa-Res, a hierarchical framework that leverages reinforcement learning to tackle manipulation tasks while a quadrupedal robot performs continuous locomotion. Unlike most previous efforts, which focus on solving a specific task, HiLMa-Res is designed to be general across loco-manipulation tasks that require a quadrupedal robot to maintain sustained mobility. The design of the framework addresses the challenge of integrating continuous locomotion control with manipulation using the legs. At its core is an operational-space locomotion controller that can track arbitrary end-effector (toe) trajectories while the robot walks at different velocities. Because this controller is general across downstream tasks, it can be used by a high-level manipulation planning policy to address each specific task. To demonstrate the versatility of the framework, we use HiLMa-Res to tackle several challenging loco-manipulation tasks with a quadrupedal robot in the real world. These tasks range from state-based to vision-based policies, and from training purely on simulation data to learning from real-world data. In these tasks, HiLMa-Res shows better performance than other methods.
