Generalization of Heterogeneous Multi-Robot Policies via Awareness and Communication of Capabilities

Pierce Howell; Max Rudolph; Reza Torbati; Kevin Fu; Harish Ravichandar

TL;DR

The paper addresses adaptive teaming: generalizing heterogeneous multi-robot policies to unseen team compositions, team sizes, and robots. It proposes capability-aware and capability-communicative policies built on Graph Neural Networks (GNNs) and trained under the centralized-training, decentralized-execution paradigm, with the problem formalized as a Dec-POMDP augmented with a multi-dimensional capability space $\mathcal{C}$. Across two tasks, Heterogeneous Material Transport (HMT) and Heterogeneous Sensor Network (HSN), evaluated in simulation and on real robots in the Robotarium, the authors show that awareness and communication of capabilities enable zero-shot generalization and improved coordination, often outperforming agent-ID baselines. The approach is relevant to deploying adaptable heterogeneous robot teams in dynamic real-world environments with minimal retraining, and it lays groundwork for future work on richer capability models and high-level coordination.
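To make the distinction between capability awareness and capability communication concrete, here is a minimal Python sketch. It is not the authors' architecture: the learned GNN layers are replaced by plain concatenation and a mean over neighbor messages, and every name and value in it (`capability_aware_input`, `communicate`, the two-dimensional observations and capability vectors, the neighbor lists) is a hypothetical chosen for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def capability_aware_input(observation: Sequence[float],
                           capabilities: Sequence[float]) -> Vector:
    """Capability awareness: append the robot's own capability vector
    (a point in the capability space C) to its local observation."""
    return list(observation) + list(capabilities)


def mean_aggregate(messages: List[Vector], dim: int) -> Vector:
    """Permutation-invariant mean of neighbor messages.
    Returns a zero vector when a robot has no neighbors."""
    if not messages:
        return [0.0] * dim
    return [sum(m[k] for m in messages) / len(messages) for k in range(dim)]


def communicate(inputs: List[Vector], neighbors: List[List[int]]) -> List[Vector]:
    """Capability communication (simplified): each robot concatenates its own
    input with the mean of its neighbors' inputs. A real GNN layer would apply
    learned transformations here; this sketch omits them."""
    dim = len(inputs[0])
    features = []
    for i, own in enumerate(inputs):
        neighbor_msgs = [inputs[j] for j in neighbors[i]]
        features.append(own + mean_aggregate(neighbor_msgs, dim))
    return features


# Hypothetical three-robot team arranged in a line: 0 - 1 - 2.
observations = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
capabilities = [[1.0, 0.5], [0.2, 0.9], [0.6, 0.6]]
neighbors = [[1], [0, 2], [1]]

inputs = [capability_aware_input(o, c) for o, c in zip(observations, capabilities)]
features = communicate(inputs, neighbors)
```

In this sketch the per-robot feature length depends only on the observation and capability dimensions, not on the number of robots or on any robot index. That is one way to see why a single shared policy of this shape can be applied unchanged to a team of a different size or composition, whereas an agent-ID input ties the policy to the specific robots seen in training.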

Abstract

Recent advances in multi-agent reinforcement learning (MARL) are enabling impressive coordination in heterogeneous multi-robot teams. However, existing approaches often overlook the challenge of generalizing learned policies to teams of new compositions, sizes, and robots. While such generalization might not be important in teams of virtual agents that can retrain policies on demand, it is pivotal in multi-robot systems that are deployed in the real world and must readily adapt to inevitable changes. As such, multi-robot policies must remain robust to team changes, an ability we call adaptive teaming. In this work, we investigate whether awareness and communication of robot capabilities can provide such generalization by conducting detailed experiments involving an established multi-robot test bed. We demonstrate that shared decentralized policies that enable robots to be both aware of and communicate their capabilities can achieve adaptive teaming by implicitly capturing the fundamental relationship between collective capabilities and effective coordination. Videos of trained policies can be viewed at: https://sites.google.com/view/cap-comm

Paper Structure (26 sections, 4 equations, 15 figures, 3 tables)

Introduction
Related Work
Capability Awareness and Communication for Adaptive Teaming
Modeling Heterogeneous Multi-Robot Teams
Problem Description
Policy Architecture
Training Procedure
Experimental Design
Results
Heterogeneous Material Transport
Heterogeneous Sensor Network
Limitations
Conclusion
Heterogeneous Material Transport (HMT) Additional Results
Heterogeneous Sensor Network (HSN) Additional Results
...and 11 more sections

Figures (15)

Figure 1: We investigate the role of capability awareness and communication in generalizing decentralized heterogeneous multi-robot coordination policies to teams of new composition, size, and robots.
Figure 2: When evaluated on teams seen during training, capability-aware policies performed comparably to ID-based policies in terms of both average return (higher is better) and task-specific metrics (lower is better).
Figure 3: When generalizing to new team compositions and sizes in HMT, capability-based policies consistently outperformed ID-based policies in terms of average steps taken to meet the quota (lower is better).
Figure 4: When generalizing to new robots with unseen values for capabilities in HMT, policies that are only aware of capabilities (CA(MLP) and CA(GNN)) outperformed policies that also communicated capabilities (CA+CC(GNN)) in terms of average number of steps taken to transport the required material (lower is better).
Figure 5: When generalizing to new team compositions and sizes in HSN, capability-based policies consistently outperformed ID-based baselines in terms of average return (higher is better). Further, combining awareness and communication of capabilities resulted in the best generalization performance.
...and 10 more figures
