Multi-party Agent Relation Sampling for Multi-party Ad Hoc Teamwork

Beiwen Zhang, Yongheng Liang, Hejun Wu

TL;DR

The paper addresses ad hoc teamwork in multi-party settings, where controlled agents must coordinate with multiple unfamiliar groups. It introduces the MAHT setting and the MARS framework, which builds a sparse agent skeleton and uses a Relational Forward Model (RFM) to capture cross-group dynamics while learning cooperative embeddings that condition the policies. Empirical results on MPE and StarCraft II show that MARS achieves stronger coordination and faster convergence than representative MARL and AHT baselines, with ablations confirming the importance of the RFM and the skeleton. The work advances scalable cross-group coordination in open multi-agent environments, with practical implications for real-world ad hoc collaboration.

Abstract

Multi-agent reinforcement learning (MARL) has achieved strong results in cooperative tasks but typically assumes fixed, fully controlled teams. Ad hoc teamwork (AHT) relaxes this by allowing collaboration with unknown partners, yet existing variants still presume shared conventions. We introduce Multi-party Ad Hoc Teamwork (MAHT), where controlled agents must coordinate with multiple mutually unfamiliar groups of uncontrolled teammates. To address this, we propose MARS, which builds a sparse skeleton graph and applies relational modeling to capture cross-group dynamics. Experiments on MPE and StarCraft II show that MARS outperforms MARL and AHT baselines while converging faster.

Paper Structure

This paper contains 11 sections, 4 equations, 2 figures.

Figures (2)

  • Figure 1: Illustration of different Ad Hoc Teamwork (AHT) settings. In standard AHT, only a single agent is controlled by the learning algorithm, while the remaining agents exhibit diverse and unknown behaviors. OAHT extends this setting to allow a dynamic team size, where uncontrolled teammates may join or leave during the task. NAHT assumes that a potentially varying number $N$ of agents are controlled by the learning algorithm, with the rest being uncontrolled. Finally, MAHT requires multiple controlled agents to cooperate and coordinate with multiple unfamiliar teams of uncontrolled agents.
  • Figure 2: Illustration of the Multi-party Agent Relation Sampling (MARS) framework. MARS enhances MAHT by enabling controlled agents to cooperate with unfamiliar teammates and sustain coordination across varying groups through three integrated stages: agent behavior encoding, relational reasoning over sparse interactions, and policy optimization guided by cooperative representations.
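
The three stages named in the Figure 2 caption can be illustrated with a toy sketch. This is not the authors' implementation: the summary does not say how the skeleton is sampled or how the relational model is built, so the top-k cosine-similarity rule, the mean-aggregation update, the random placeholder encodings, and every function name below (`sparse_skeleton`, `relational_step`) are assumptions made purely for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def sparse_skeleton(embeddings, k):
    """Hypothetical sparsification rule: each agent keeps directed
    edges to its k most similar peers, never to itself."""
    n = len(embeddings)
    edges = {}
    for i in range(n):
        scored = sorted(
            ((cosine(embeddings[i], embeddings[j]), j) for j in range(n) if j != i),
            reverse=True,
        )
        edges[i] = [j for _, j in scored[:k]]
    return edges


def relational_step(embeddings, edges):
    """Stand-in for relational reasoning: blend each agent's vector
    with the mean of its skeleton neighbours' vectors."""
    updated = []
    for i, own in enumerate(embeddings):
        neighbours = [embeddings[j] for j in edges[i]]
        if not neighbours:
            updated.append(list(own))
            continue
        mean = [sum(col) / len(neighbours) for col in zip(*neighbours)]
        updated.append([0.5 * a + 0.5 * m for a, m in zip(own, mean)])
    return updated


rng = random.Random(0)
# Stage 1 (placeholder): random vectors stand in for learned behavior encodings.
agent_embeddings = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]
# Stage 2: build the sparse graph and run one relational update over it.
skeleton = sparse_skeleton(agent_embeddings, k=2)
coop_embeddings = relational_step(agent_embeddings, skeleton)
# Stage 3 (placeholder): a policy would receive (observation, coop_embeddings[i]).
```

Restricting each agent to `k` neighbours keeps the number of edges linear in the number of agents instead of quadratic, which is one plausible motivation for a sparse skeleton when several groups share an environment.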