Whole-body End-Effector Pose Tracking

Tifanny Portela; Andrei Cramariuc; Mayank Mittal; Marco Hutter

TL;DR

The paper tackles end-effector pose tracking for legged robots with an attached manipulator, targeting a large workspace and rough terrain. It introduces a whole-body RL framework trained in simulation that combines terrain-aware command sampling, a keypoint-based end-effector pose representation, and a curriculum that progressively expands the reachable workspace; the learned policy is then transferred to hardware. The controller achieves a pose-tracking error of 2.64 cm and 3.64° on stairs and transfers well from simulation to hardware, outperforming a model predictive control (MPC) baseline on flat terrain and exceeding prior RL methods in workspace reach and robustness. By reducing reliance on accurate dynamic models, the work broadens the practical manipulation capabilities of legged robots in unstructured environments.
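To make the idea of a keypoint-based pose representation concrete, here is a minimal sketch. The summary above does not specify the keypoint layout, so everything here is an assumption: three keypoints placed a hypothetical `offset` along the end-effector's local axes, and helper names (`rotation_z`, `pose_to_keypoints`, `keypoint_error`) that are illustrative only. The point is that a single distance over keypoints penalizes position and orientation error together.

```python
import math

def rotation_z(yaw):
    """3x3 rotation matrix for a rotation of `yaw` radians about the z-axis."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def pose_to_keypoints(position, rotation, offset=0.3):
    """Map a pose (position, 3x3 rotation matrix) to three keypoints.

    Assumption: keypoint i sits `offset` metres along the end-effector's
    i-th local axis. Both the layout and the offset are illustrative.
    """
    keypoints = []
    for axis in range(3):
        # Column `axis` of the rotation matrix is that local axis in the world frame.
        direction = [rotation[row][axis] for row in range(3)]
        keypoints.append([p + offset * d for p, d in zip(position, direction)])
    return keypoints

def keypoint_error(kp_a, kp_b):
    """Mean Euclidean distance between corresponding keypoints."""
    return sum(math.dist(a, b) for a, b in zip(kp_a, kp_b)) / len(kp_a)

# Usage: a pure 90-degree yaw error at the same position still yields a nonzero error.
target = pose_to_keypoints([0.0, 0.0, 0.0], rotation_z(0.0))
rotated = pose_to_keypoints([0.0, 0.0, 0.0], rotation_z(math.pi / 2))
yaw_only_error = keypoint_error(target, rotated)
```

Under these assumptions, a translation-only error moves all keypoints equally, while a rotation-only error moves only the keypoints off the rotation axis, so both kinds of error show up in one scalar.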

Abstract

Combining manipulation with the mobility of legged robots is essential for a wide range of robotic applications. However, integrating an arm with a mobile base significantly increases the system's complexity, making precise end-effector control challenging. Existing model-based approaches are often constrained by their modeling assumptions, leading to limited robustness. Meanwhile, recent Reinforcement Learning (RL) implementations restrict the arm's workspace to be in front of the robot or track only the position to obtain decent tracking accuracy. In this work, we address these limitations by introducing a whole-body RL formulation for end-effector pose tracking in a large workspace on rough, unstructured terrains. Our proposed method involves a terrain-aware sampling strategy for the robot's initial configuration and end-effector pose commands, as well as a game-based curriculum to extend the robot's operating range. We validate our approach on the ANYmal quadrupedal robot with a six DoF robotic arm. Through our experiments, we show that the learned controller achieves precise command tracking over a large workspace and adapts across varying terrains such as stairs and slopes. On deployment, it achieves a pose-tracking error of 2.64 cm and 3.64 degrees, outperforming existing competitive baselines.
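The abstract reports tracking error as a distance in centimetres and an angle in degrees. The page does not say how these are computed, so the sketch below assumes a common convention: Euclidean distance for position and the rotation angle of the relative rotation between commanded and measured orientation. The function names are hypothetical.

```python
import math

def rotation_z(yaw):
    """3x3 rotation matrix for a rotation of `yaw` radians about the z-axis."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def position_error_cm(p_cmd, p_meas):
    """Euclidean distance between two positions given in metres, returned in cm."""
    return 100.0 * math.dist(p_cmd, p_meas)

def orientation_error_deg(r_cmd, r_meas):
    """Angle of the relative rotation R_cmd^T R_meas, in degrees.

    Uses trace(R_cmd^T R_meas) = sum of elementwise products, and
    angle = acos((trace - 1) / 2), clamped against rounding error.
    """
    trace = sum(r_cmd[i][j] * r_meas[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

# Usage: a measured pose offset by 2.64 cm and rotated by 3.64 degrees about z.
pos_err = position_error_cm([0.0, 0.0, 0.0], [0.0264, 0.0, 0.0])
rot_err = orientation_error_deg(rotation_z(0.0), rotation_z(math.radians(3.64)))
```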

Paper Structure (21 sections, 5 equations, 5 figures, 1 table)

INTRODUCTION
RELATED WORK
Model-based whole-body control
Learning-based whole-body control
METHOD
Policy Architecture
Command sampling
Command definition
Action and Observation Space
Rewards
Terrains and Curriculum training
Initial poses
Sim-to-Real
RESULTS AND DISCUSSION
Simulation experiments
...and 6 more sections

Figures (5)

Figure 1: Our whole-body controller demonstrates precise end-effector pose tracking across a variety of challenging terrains, including soft mattresses, stairs and uneven natural ground.
Figure 2: The training process begins with data collection, where we gather (A) the terrain mesh and a coarse terrain height map, (B) 10000 pre-sampled end-effector pose commands with a fixed base, and (C) base poses and joint angles from a pre-trained locomotion policy to initialize robots. During training, a command from (B) is slightly transformed and checked for collisions with the terrain. If collision-free, it is concatenated with observations and input to the policy; otherwise, a new command is sampled. The policy is trained in simulation with 4000 robots in parallel, outputting joint actions.
Figure 3: Front and side views of the workspace with 10000 collision-free end-effector poses (subsampled to 250 and illustrating only positions for readability).
Figure 4: Distribution of the position and orientation errors for 10000 end-effector pose commands measured on flat terrain in simulation for four different pose representations.
Figure 5: Distribution of the position and orientation errors for 20 end-effector pose commands, measured on hardware, on both flat terrain and stairs.
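The command-sampling loop described in the Figure 2 caption (pick a pre-sampled command, transform it slightly, reject it if it collides with the terrain, otherwise use it) can be sketched as simple rejection sampling. This is a simplified illustration, not the authors' implementation: it handles positions only, treats the "slight transform" as a small random translation, and checks collision against a nearest-cell height-map lookup. The parameters `clearance`, `jitter`, and `max_tries` are hypothetical.

```python
import random

def terrain_height(height_map, cell_size, x, y):
    """Nearest-cell lookup in a coarse height map (rows index y, columns index x).

    Indices are clamped, so queries outside the map return the border height.
    """
    row = min(max(int(round(y / cell_size)), 0), len(height_map) - 1)
    col = min(max(int(round(x / cell_size)), 0), len(height_map[0]) - 1)
    return height_map[row][col]

def sample_command(presampled, base_xy, height_map, cell_size, rng,
                   clearance=0.05, jitter=0.02, max_tries=100):
    """Return a collision-free (x, y, z) command, or None if none is found.

    `presampled` holds (dx, dy, z) tuples: horizontal offsets from the base
    and a world-frame height. All of this is an assumed, simplified format.
    """
    for _ in range(max_tries):
        dx, dy, z = rng.choice(presampled)
        # Assumed "slight transform": a small random translation of the command.
        x = base_xy[0] + dx + rng.uniform(-jitter, jitter)
        y = base_xy[1] + dy + rng.uniform(-jitter, jitter)
        z = z + rng.uniform(-jitter, jitter)
        # Accept only commands that stay above the terrain by the clearance.
        if z >= terrain_height(height_map, cell_size, x, y) + clearance:
            return (x, y, z)
    return None

# Usage: flat ground with a 1 m step at x = 2; the middle command lies inside the step.
height_map = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
presampled = [(0.0, 0.0, 0.5), (2.0, 0.0, 0.5), (2.0, 0.0, 1.5)]
rng = random.Random(0)
commands = [sample_command(presampled, (0.0, 0.0), height_map, 1.0, rng) for _ in range(200)]
```

Resampling instead of projecting a colliding command onto the surface keeps the accepted commands distributed like the pre-sampled set, restricted to the collision-free region.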
