Whole-body End-Effector Pose Tracking
Tifanny Portela, Andrei Cramariuc, Mayank Mittal, Marco Hutter
TL;DR
The paper tackles end-effector pose tracking for legged robots with attached manipulators, targeting large workspaces and rough terrain. It introduces a whole-body RL framework trained in simulation with terrain-aware command sampling, a keypoint-based end-effector pose representation, and curriculum learning to expand the reachable workspace, then transfers the policy to hardware. On deployment, the approach achieves a pose-tracking error of 2.64 cm and 3.64°, and it shows strong sim-to-real transfer, outperforming model-based MPC on flat terrain and surpassing prior RL methods in workspace reach and robustness. The work broadens the practical capabilities of legged robots for manipulation in unstructured environments and reduces reliance on accurate dynamic models.
Abstract
Combining manipulation with the mobility of legged robots is essential for a wide range of robotic applications. However, integrating an arm with a mobile base significantly increases the system's complexity, making precise end-effector control challenging. Existing model-based approaches are often constrained by their modeling assumptions, leading to limited robustness. Meanwhile, recent Reinforcement Learning (RL) implementations restrict the arm's workspace to be in front of the robot or track only the position to obtain decent tracking accuracy. In this work, we address these limitations by introducing a whole-body RL formulation for end-effector pose tracking in a large workspace on rough, unstructured terrains. Our proposed method involves a terrain-aware sampling strategy for the robot's initial configuration and end-effector pose commands, as well as a game-based curriculum to extend the robot's operating range. We validate our approach on the ANYmal quadrupedal robot with a six DoF robotic arm. Through our experiments, we show that the learned controller achieves precise command tracking over a large workspace and adapts across varying terrains such as stairs and slopes. On deployment, it achieves a pose-tracking error of 2.64 cm and 3.64 degrees, outperforming existing competitive baselines.
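The keypoint-based pose representation mentioned in the TL;DR can be illustrated with a minimal sketch. It assumes three keypoints placed along the axes of the end-effector frame, so that position and orientation error collapse into a single distance in meters. The function names, the `side` parameter, and the choice of three keypoints are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    q = np.asarray(q, dtype=float)
    w, x, y, z = q / np.linalg.norm(q)  # normalize to guard against drift
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


def pose_to_keypoints(pos, quat, side=0.3):
    """Return three keypoints (rows) offset along the frame's x, y, z axes.

    `side` is a hypothetical scale in meters: larger values weight
    orientation error more heavily relative to position error.
    """
    offsets = 0.5 * side * np.eye(3)       # one offset per local axis
    rot = quat_to_rotmat(quat)
    return np.asarray(pos, dtype=float) + offsets @ rot.T


def keypoint_error(pos_a, quat_a, pos_b, quat_b, side=0.3):
    """Mean Euclidean distance between corresponding keypoints of two poses."""
    kp_a = pose_to_keypoints(pos_a, quat_a, side)
    kp_b = pose_to_keypoints(pos_b, quat_b, side)
    return np.linalg.norm(kp_a - kp_b, axis=1).mean()
```

Under these assumptions, a pure translation produces an error equal to the translation distance, while a pure rotation produces an error that grows with `side`, which is one way such a representation can trade off position against orientation accuracy without a separate weighting term.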
