A Framework for Deploying Learning-based Quadruped Loco-Manipulation

Yadong Liu; Jianwei Liu; He Liang; Dimitrios Kanoulas

TL;DR

The paper presents an open, end-to-end framework for training, benchmarking, and deploying reinforcement-learning-based loco-manipulation on the Unitree B1+Z1 platform, enabling robust sim-to-sim and sim-to-real transfer via a ROS-based stack and a MuJoCo hardware interface. It reveals cross-simulator differences in contact modeling that affect policy behavior and demonstrates that coordinated whole-body control enhances manipulation reach and stability in real-world teleoperation. Key contributions include a unified deployment pipeline, sim-to-sim and sim-to-real implementations, and a foot-contact estimation mechanism that supports safe real-time operation. The findings underscore the importance of contact modeling robustness and present mid-pose configurations as a practical guideline for balancing locomotion stability and manipulation reach in real-world deployments. Overall, the work advances reproducible RL-based loco-manipulation on legged mobile manipulators and lays groundwork for open, comparative evaluation in future studies.

Abstract

Quadruped mobile manipulators offer strong potential for agile loco-manipulation but remain difficult to control and transfer reliably from simulation to reality. Reinforcement learning (RL) shows promise for whole-body control, yet most frameworks are proprietary and hard to reproduce on real hardware. We present an open pipeline for training, benchmarking, and deploying RL-based controllers on the Unitree B1 quadruped with a Z1 arm. The framework unifies sim-to-sim and sim-to-real transfer through ROS, re-implementing a policy trained in Isaac Gym, extending it to MuJoCo via a hardware abstraction layer, and deploying the same controller on physical hardware. Sim-to-sim experiments expose discrepancies between Isaac Gym and MuJoCo contact models that influence policy behavior, while real-world teleoperated object-picking trials show that coordinated whole-body control extends reach and improves manipulation over floating-base baselines. The pipeline provides a transparent, reproducible foundation for developing and analyzing RL-based loco-manipulation controllers and will be released open source to support future research.
