Dual-Quadruped Collaborative Transportation in Narrow Environments via Safe Reinforcement Learning
Zhezhi Lei, Zhihai Bi, Wenxin Wang, Jun Ma
TL;DR
This work tackles safe, decentralized collaborative payload transport by framing dual-quadruped coordination as a fully cooperative constrained Markov game with a shared safety budget $u$. It introduces a cost-advantage decomposition that enables stable, monotonic improvement under shared constraints, and a constraint allocation mechanism that distributes the budget among robots, guided by Bayesian optimization and a Lagrangian-based training loop. The approach uses two separate critics for reward and cost, trust-region updates, and KL bounds to maintain safety while promoting collaboration, and is evaluated in simulation and real-world tests (gate, corridor, forest). Results show improved safety (lower collision probability), more efficient collaboration (straighter, shorter trajectories), and adaptive formation reconfiguration in narrow environments, outperforming cost-aware and reward-only baselines. The framework offers a practical pathway to reliable multi-robot transportation in constrained settings with distributed control and explicit safety guarantees.
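As a rough illustration of the budget-splitting and Lagrangian ideas mentioned above, the following minimal Python sketch shows one way a shared budget could be divided between two robots and how a per-robot Lagrange multiplier might be updated by projected dual ascent. The function names, the split ratio `alpha`, and the learning rate are hypothetical choices made for this example and are not taken from the paper; in the described framework the split would be tuned by a search procedure such as Bayesian optimization rather than fixed by hand.

```python
def allocate_budget(u: float, alpha: float) -> tuple[float, float]:
    """Split a shared safety budget u between two robots.

    alpha is a hypothetical split ratio in [0, 1]; the two shares
    always sum to u, so meeting both individual limits implies the
    shared limit is met.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * u, (1.0 - alpha) * u


def update_multiplier(lam: float, cost_estimate: float,
                      budget: float, lr: float = 0.1) -> float:
    """One projected dual-ascent step on a Lagrange multiplier.

    The multiplier grows when the estimated cost exceeds the budget
    and shrinks otherwise; it is clipped at zero to stay non-negative.
    """
    return max(0.0, lam + lr * (cost_estimate - budget))


if __name__ == "__main__":
    u1, u2 = allocate_budget(u=1.0, alpha=0.25)
    lam1 = update_multiplier(lam=0.0, cost_estimate=0.5, budget=u1)
    print(u1, u2, lam1)
```

A larger multiplier penalizes the corresponding robot's cost more heavily in its policy update, which is the usual role of the dual variable in Lagrangian-based safe RL.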
Abstract
Collaborative transportation, where multiple robots jointly transport a payload, has garnered significant attention in recent years. While ensuring safe and high-performance inter-robot collaboration is critical for effective task execution, it is difficult to achieve in narrow environments where the feasible region is extremely limited. To address this challenge, we propose a novel approach for dual-quadruped collaborative transportation via safe reinforcement learning (RL). Specifically, we model the task as a fully cooperative constrained Markov game, where collision avoidance is formulated as constraints. We introduce a cost-advantage decomposition method that enforces the sum of team constraints to remain below an upper bound, thereby guaranteeing task safety within an RL framework. Furthermore, we propose a constraint allocation method that assigns shared constraints to individual robots to maximize the overall task reward. This encourages autonomous task assignment among robots and thereby improves collaborative task performance. Simulation and real-time experimental results demonstrate that the proposed approach achieves superior performance and a higher success rate in dual-quadruped collaborative transportation compared to existing methods.
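The constrained formulation described in the abstract can be sketched as follows. This is an illustrative reading rather than the paper's exact statement: the discounted infinite-horizon form, the symbols $\pi^i$, $r$, $c^i$, $\gamma$, and the per-robot budgets $u^i$ are notation assumed here, with $u$ denoting the upper bound on the summed team cost.

```latex
\begin{align}
\max_{\pi^1,\,\pi^2}\quad
  & J(\pi) = \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty}\gamma^{t}\, r(s_t, a_t^1, a_t^2)\Big] \\
\text{s.t.}\quad
  & \sum_{i=1}^{2} J_c^{i}(\pi) \le u,
  \qquad
  J_c^{i}(\pi) = \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty}\gamma^{t}\, c^{i}(s_t, a_t^1, a_t^2)\Big] \\
\text{allocation:}\quad
  & u^{1} + u^{2} = u,\quad u^{i} \ge 0,\quad
    J_c^{i}(\pi) \le u^{i}\ \ (i = 1, 2)
    \;\Longrightarrow\; \sum_{i=1}^{2} J_c^{i}(\pi) \le u .
\end{align}
```

Under this reading, any allocation whose shares sum to $u$ keeps the team within the shared bound, so the allocation can be chosen freely to maximize the task reward $J(\pi)$.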
