Heterogeneous Multi-agent Zero-Shot Coordination by Coevolution

Ke Xue; Yutong Wang; Cong Guan; Lei Yuan; Haobo Fu; Qiang Fu; Chao Qian; Yang Yu

TL;DR

This work addresses zero-shot coordination with unseen, heterogeneous partners in cooperative MARL by introducing MAZE, a coevolutionary framework that maintains two policy populations (agents and partners) and evolves them through pairing, updating, and selection while preserving diversity via an archive. The approach leverages a PPO-like update augmented with a diversity term and uses Jensen-Shannon Divergence across population policies to encourage varied behaviors, enabling robust coordination with novel partners. Empirical validation in Overcooked and FillInTheGrid across multiple heterogeneous layouts shows MAZE outperforms self-play and prior diversity-based baselines, and human-in-the-loop experiments corroborate improved real-human coordination. The work highlights heterogeneity as essential in ZSC and offers a scalable, modular framework that can be integrated with existing ZSC techniques and extended to broader heterogeneous multi-agent tasks.
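The diversity signal mentioned above can be illustrated with the standard generalized Jensen-Shannon Divergence over a set of action distributions: the entropy of the mixture minus the weighted mean of the individual entropies. The following is a minimal sketch, not the authors' implementation; the function names, the uniform default weights, and the representation of each policy as a plain list of action probabilities at a single state are all assumptions made for illustration.

```python
import math


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution; zero entries are skipped."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


def jensen_shannon_divergence(policies, weights=None):
    """Generalized JSD of several action distributions.

    `policies` is a list of probability vectors over the same action set
    (hypothetical representation: one vector per population member at one state).
    `weights` defaults to uniform, which is an assumption of this sketch.
    """
    n = len(policies)
    if weights is None:
        weights = [1.0 / n] * n
    num_actions = len(policies[0])
    # Mixture distribution: weighted average of the individual policies.
    mixture = [
        sum(w * pi[a] for w, pi in zip(weights, policies))
        for a in range(num_actions)
    ]
    # JSD = H(mixture) - sum_i w_i * H(pi_i)
    return entropy(mixture) - sum(w * entropy(pi) for w, pi in zip(weights, policies))


identical = [[0.5, 0.5], [0.5, 0.5]]
disjoint = [[1.0, 0.0], [0.0, 1.0]]
jsd_identical = jensen_shannon_divergence(identical)
jsd_disjoint = jensen_shannon_divergence(disjoint)
```

Identical policies yield a divergence of zero, while two policies that always pick different actions reach the two-policy maximum of ln 2, so a population with a larger value behaves more diversely at that state.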

Abstract

Generating agents that can achieve zero-shot coordination (ZSC) with unseen partners is a new challenge in cooperative multi-agent reinforcement learning (MARL). Recently, some studies have made progress in ZSC by exposing the agents to diverse partners during the training process. They usually involve self-play when training the partners, implicitly assuming that the tasks are homogeneous. However, many real-world tasks are heterogeneous, and hence previous methods may be inefficient. In this paper, we study the heterogeneous ZSC problem for the first time and propose a general method based on coevolution, which coevolves two populations of agents and partners through three sub-processes: pairing, updating and selection. Experimental results on various heterogeneous tasks highlight the necessity of considering the heterogeneous setting and demonstrate that our proposed method is a promising solution for heterogeneous ZSC tasks.
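The three sub-processes named in the abstract (pairing, updating, selection) can be sketched as a toy loop. This is only an illustration of the control flow under strong assumptions: policies are reduced to single floats, `joint_reward` is a hypothetical stand-in for an episode return, updating is simple hill climbing rather than a policy-gradient step, and the pairing and selection rules shown here are invented for the sketch and are not taken from the paper.

```python
import random


def joint_reward(agent, partner):
    # Hypothetical stand-in for a cooperative episode return:
    # highest (zero) when agent + partner == 1.
    return -abs(agent + partner - 1.0)


def coevolve(num_generations=20, pop_size=4, seed=0):
    rng = random.Random(seed)
    # Two separate populations, one per role (toy scalar "policies").
    agents = [rng.uniform(-1.0, 1.0) for _ in range(pop_size)]
    partners = [rng.uniform(-1.0, 1.0) for _ in range(pop_size)]
    archive = []  # past partners, kept as a diversity pool (assumed usage)

    for _ in range(num_generations):
        # Pairing (assumption): a random one-to-one matching.
        order = list(range(pop_size))
        rng.shuffle(order)
        pairs = [(i, order[i]) for i in range(pop_size)]

        # Updating (assumption): accept a joint perturbation if it improves reward.
        for i, j in pairs:
            cand_agent = agents[i] + rng.gauss(0.0, 0.1)
            cand_partner = partners[j] + rng.gauss(0.0, 0.1)
            if joint_reward(cand_agent, cand_partner) > joint_reward(agents[i], partners[j]):
                agents[i], partners[j] = cand_agent, cand_partner

        # Selection (assumption): archive the partners, then replace the agent
        # that coordinates worst with the archive by a mutated copy of the best.
        archive.extend(partners)
        scores = [
            sum(joint_reward(a, p) for p in archive) / len(archive)
            for a in agents
        ]
        best = scores.index(max(scores))
        worst = scores.index(min(scores))
        agents[worst] = agents[best] + rng.gauss(0.0, 0.1)

    return agents, partners, archive


agents, partners, archive = coevolve()
```

Keeping the two populations separate is what distinguishes this loop from self-play: each role is trained against members of the other population, which matters when the roles are not interchangeable.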

Paper Structure (36 sections, 2 equations, 10 figures, 8 tables, 1 algorithm)

Introduction
Background
Related work
Zero-Shot Coordination
Diversity in RL
Coevolution
MAZE Method
Pairing
Updating
Selection
Deployment
Experiments
Environments
Experimental settings
SP
...and 21 more sections

Figures (10)

Figure 1: An example illustration of Human-AI coordination with increasing heterogeneity.
Figure 2: Illustration of the MAZE method, where MAZE coevolves two populations of agents and partners through three sub-processes, i.e., pairing, updating and selection.
Figure 3: The AA layout of the Overcooked environment. The Overcooked environment is a two-player common-payoff game, where players need to coordinate to cook and deliver soup.
Figure 4: Illustration of different layouts on Overcooked.
Figure 5: FillInTheGrid environments. (a) Grid-T layout. (b) Grid-L layout.
...and 5 more figures
