Multi-Agent Manipulation via Locomotion using Hierarchical Sim2Real
Ofir Nachum, Michael Ahn, Hugo Ponte, Shixiang Gu, Vikash Kumar
TL;DR
This paper tackles multi-agent manipulation via locomotion with hierarchical sim2real: a low-level goal-reaching locomotion policy is learned in simulation, and a high-level controller, also trained in simulation, learns to issue the goals that steer it. Training proceeds in two phases, low level first and then high level, with domain randomization applied at each level so that the full policy transfers zero-shot to real robots. The method is validated on three real-world quadrupedal tasks (Avoid, Push, and Coordinate), showing that hierarchy plus targeted randomization yields robust real-world performance, including a successful demonstration of coordinated multi-agent manipulation. The work highlights modularity in sim2real and suggests that hierarchical structure simplifies bridging the sim-to-real gap for complex, interactive robotics tasks.
Abstract
Manipulation and locomotion are closely related problems that are often studied in isolation. In this work, we study the problem of coordinating multiple mobile agents to exhibit manipulation behaviors using a reinforcement learning (RL) approach. Our method hinges on the use of hierarchical sim2real -- a simulated environment is used to learn low-level goal-reaching skills, which are then used as the action space for a high-level RL controller, also trained in simulation. The full hierarchical policy is then transferred to the real world in a zero-shot fashion. The application of domain randomization during training enables the learned behaviors to generalize to real-world settings, while the use of hierarchy provides a modular paradigm for learning and transferring increasingly complex behaviors. We evaluate our method on a number of real-world tasks, including coordinated object manipulation in a multi-agent setting. See videos at https://sites.google.com/view/manipulation-via-locomotion
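The control flow described in the abstract can be illustrated with a small, single-agent toy sketch. It is not the authors' implementation: both policies are hand-written stand-ins for learned networks, the agent is a 2D point rather than a quadruped, and every name and numeric range here (`sample_dynamics`, `high_level_period`, the speed and noise bounds) is a hypothetical choice made only for illustration. It shows the high-level controller emitting goals at a coarse time scale, the low-level skill consuming those goals as its input, and per-episode randomization of the simulated dynamics.

```python
import math
import random


def sample_dynamics(rng):
    """Domain randomization: draw per-episode dynamics (hypothetical ranges)."""
    return {"speed": rng.uniform(0.05, 0.15), "noise": rng.uniform(0.0, 0.01)}


def low_level_policy(pos, goal):
    """Stand-in for a learned goal-reaching skill: unit direction toward the goal."""
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    dist = math.hypot(dx, dy)
    if dist < 1e-9:
        return (0.0, 0.0)
    return (dx / dist, dy / dist)


def high_level_policy(pos, target, max_reach=0.5):
    """Stand-in for the high-level controller: its 'action' is a nearby goal."""
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    dist = math.hypot(dx, dy)
    if dist < 1e-9:
        return pos
    reach = min(dist, max_reach)
    return (pos[0] + reach * dx / dist, pos[1] + reach * dy / dist)


def run_episode(target, seed=0, horizon=200, high_level_period=10):
    """Roll out the two-level policy in a randomized toy simulator."""
    rng = random.Random(seed)
    dyn = sample_dynamics(rng)  # new dynamics every episode
    pos = (0.0, 0.0)
    goal = pos
    for t in range(horizon):
        # The high level acts only every `high_level_period` steps.
        if t % high_level_period == 0:
            goal = high_level_policy(pos, target)
        ax, ay = low_level_policy(pos, goal)
        # Cap the step so the agent does not overshoot the current goal.
        step = min(dyn["speed"], math.hypot(goal[0] - pos[0], goal[1] - pos[1]))
        pos = (
            pos[0] + step * ax + rng.gauss(0.0, dyn["noise"]),
            pos[1] + step * ay + rng.gauss(0.0, dyn["noise"]),
        )
    return pos


if __name__ == "__main__":
    for seed in range(3):
        print(seed, run_episode((1.0, 1.0), seed=seed))
```

Because the high level only ever sees the low level through the goals it issues, either level could in principle be retrained or replaced without touching the other, which is the modularity the abstract points to.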
