Multi-Quadruped Cooperative Object Transport: Learning Decentralized Pinch-Lift-Move

Bikram Pandit, Aayam Kumar Shrestha, Alan Fern

TL;DR

This work addresses cooperative object transport in the challenging setting where mechanically independent robots must coordinate through contact forces alone, without any communication or centralized control. It employs a hierarchical policy architecture that separates base locomotion from arm control.

Abstract

We study decentralized cooperative transport using teams of N quadruped robots with arms that must pinch, lift, and move ungraspable objects through physical contact alone. Unlike prior work that relies on rigid mechanical coupling between robots and objects, we address the more challenging setting where mechanically independent robots must coordinate through contact forces alone without any communication or centralized control. To this end, we employ a hierarchical policy architecture that separates base locomotion from arm control, and propose a constellation reward formulation that unifies position and orientation tracking to enforce rigid contact behavior. The key insight is to encourage robots to behave as if rigidly connected to the object through careful reward design and a training curriculum rather than explicit mechanical constraints. Our approach enables coordination through shared policy parameters and implicit synchronization cues, scaling to arbitrary team sizes without retraining. Extensive simulation experiments demonstrate robust transport across 2-10 robots on diverse object geometries and masses, along with sim2real transfer results on lightweight objects.

Paper Structure

This paper contains 19 sections, 7 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: Overview of decPLM (Decentralized Pinch-Lift-Move). Top-Left: Policy structure. Each robot runs the same decentralized policy that receives local proprioception (s), contact-frame command $c_{\text{cf}}$, and (optionally) contact frame pose in base frame ${}^bT_{\text{cf}}$, and outputs arm joint targets and base velocity commands. Bottom-Left: Training setup with two robots in simulation on a box. Middle: Generalization to larger teams and diverse payloads, including a box, a log, a barrel, and a couch. Right: Sim2Real Demonstration with 2, 3 and 4 Robots, each running decPLM$\left(\text{const}^+, \text{cf}^{\text{init}}\right)$ policy independently, without any inter-robot communication.
  • Figure 2: Reward activation schedule: Each row lists a reward term along with the command phase (pinch, lift, move) in which it is used. The blue bars show when, within an episode, each reward is active. This illustrates how different rewards are introduced and shaped across both training phases and episode time.
  • Figure 3: Constellation reward illustration. Source points (blue) on the pad and base are matched to their corresponding target points (green). Dotted orange lines show the errors that the policy must minimize, with the two groups representing End-effector Contact Constellation and Base Tracking Constellation.
  • Figure 4: Robot arrangements with different team sizes. Robots are evenly distributed around the box ($1.0 \times 1.5 \times 0.7$ m, mass $2$ kg), as used in the experiments of Fig. 5. For the two-robot case, a smaller box ($0.5 \times 0.4 \times 0.7$ m) is used. Robot indexing, when referenced, starts from the top-right corner and increases clockwise around the box.
  • Figure 5: We compare performance across different team sizes for four model variants: decPLM$\left(\text{const}^-, \text{cf}^{\text{init}}\right)$, decPLM$\left(\text{const}^-, \text{cf}^+\right)$, decPLM$\left(\text{const}^+, \text{cf}^{\text{init}}\right)$, and decPLM$\left(\text{const}^+, \text{cf}^+\right)$. Line style indicates access to contact-frame pose (solid = cf$^+$ vs. dotted = cf$^{\text{init}}$), marker shape indicates use of the constellation reward (circle = const$^+$ vs. cross = const$^-$), and colors correspond to the methods as listed. We also include a 1-robot baseline, which does not carry any load but is provided as an idealized reference value. Finally, $\text{decPLM}_{\text{3r}}$ is a variant trained with three robots instead of two. For details, see the subsections on constellation reward effectiveness and object mass scaling.
  • ...and 2 more figures
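
The Figure 3 caption describes the constellation reward as matching source points to corresponding target points and minimizing the errors between them. The Python sketch below is one possible reading of that idea, not the paper's formulation: a small set of points is fixed in a rigid frame, placed once at the current pose and once at the target pose, and the mean point-to-point distance serves as a single error that grows with both position and orientation mismatch. The point layout `PAD_POINTS`, the yaw-only rotation, the exponential shaping, and the scale `sigma` are all assumptions made for illustration.

```python
import math


def yaw_rotation(yaw):
    """3x3 rotation matrix for a rotation of `yaw` radians about the z-axis."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def transform(points, rotation, translation):
    """Map each local point p to R p + t."""
    return [
        tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        )
        for p in points
    ]


def constellation_error(local_points, source_pose, target_pose):
    """Mean distance between corresponding source and target points.

    Each pose is a (rotation, translation) pair. Because the points are
    spread out in the frame, a pure rotation error also moves them apart,
    so one scalar covers position and orientation tracking.
    """
    src = transform(local_points, *source_pose)
    tgt = transform(local_points, *target_pose)
    return sum(math.dist(a, b) for a, b in zip(src, tgt)) / len(local_points)


def constellation_reward(local_points, source_pose, target_pose, sigma=0.1):
    """Hypothetical shaping: 1.0 at perfect alignment, decaying with error."""
    err = constellation_error(local_points, source_pose, target_pose)
    return math.exp(-err / sigma)


# Hypothetical constellation: three non-collinear points, 10 cm from the origin.
PAD_POINTS = [(0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.0, 0.0, 0.1)]

aligned = (yaw_rotation(0.0), (0.0, 0.0, 0.0))
shifted = (yaw_rotation(0.0), (0.3, 0.0, 0.0))
rotated = (yaw_rotation(math.pi / 2), (0.0, 0.0, 0.0))

translation_error = constellation_error(PAD_POINTS, aligned, shifted)
rotation_error = constellation_error(PAD_POINTS, aligned, rotated)
perfect_reward = constellation_reward(PAD_POINTS, aligned, aligned)
```

In this sketch a pure translation offset and a pure rotation offset both produce a nonzero error, which is the sense in which a point-set formulation can replace separate position and orientation terms. The same function could be applied to a second point set on the robot base, mirroring the two groups named in the caption.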