HOLA-Drone: Hypergraphic Open-ended Learning for Zero-Shot Multi-Drone Cooperative Pursuit

Yang Li, Dengyu Zhang, Junfan Chen, Ying Wen, Qingrui Zhang, Shaoshuai Mou, Wei Pan

TL;DR

The paper addresses zero-shot coordination in a multi-drone pursuit setting by formulating the task as a Dec-POMDP and introducing HOLA-Drone, a hypergraphic open-ended learning framework. Through HyFoG (Hypergraphic-Form Game) and a preference hypergraph, the method continuously adapts training objectives to enhance coordination with unseen teammates. Empirical results in simulation and real-world Crazyflie experiments show that HOLA-Drone outperforms baselines, with ablation confirming the critical role of the $\boldsymbol{\phi}$ Solver and hypergraph-based training. This work extends ZSC from two-player games to multi-agent, physically grounded drone teams, offering a scalable approach for robust, adaptive cooperative pursuit in unknown partner settings.

Abstract

Zero-shot coordination (ZSC) is a significant challenge in multi-agent collaboration, aiming to develop agents that can coordinate with partners they have not encountered before. Recent cutting-edge ZSC methods have primarily focused on two-player video games such as Overcooked! 2 and Hanabi. In this paper, we extend the scope of ZSC research to the multi-drone cooperative pursuit scenario, exploring how to construct a drone agent capable of coordinating with multiple unseen partners to capture multiple evaders. We propose a novel Hypergraphic Open-ended Learning Algorithm (HOLA-Drone) that continuously adapts the learning objective based on our hypergraphic-form game modeling, aiming to improve cooperative abilities with multiple unknown drone teammates. To empirically verify the effectiveness of HOLA-Drone, we build two different unseen drone teammate pools to evaluate their performance in coordination with various unseen partners. The experimental results demonstrate that HOLA-Drone outperforms the baseline methods in coordination with unseen drone teammates. Furthermore, real-world experiments validate the feasibility of HOLA-Drone in physical systems. Videos can be found on the project homepage: https://sites.google.com/view/hola-drone

Paper Structure (26 sections, 8 equations, 5 figures, 3 tables)

Figures (5)

  • Figure 1: Top row: Classic multi-agent reinforcement learning (MARL) with the centralized training and decentralized execution (CTDE) framework. Training phase: all agents are updated collectively. Evaluation phase: the same agents that were involved in the training phase are deployed, and the evaluation assesses the collective performance of these pre-trained agents in achieving the task using the strategies learned during training. Bottom row: Proposed zero-shot cooperative multi-drone pursuit scheme. Training phase: a single learner agent is trained by co-playing with a set of non-learnable partners. Evaluation phase: the learner agent is required to coordinate with previously unseen partner agents that were not part of its training. The goal is to assess the learner's zero-shot coordination ability with any new, unseen partner without additional updating.
  • Figure 2: Hypergraphic Open-ended Learning Algorithm: Detailed illustration of a single generation within the open-ended learning phase, including the Grapher and Oracle modules.
  • Figure 3: Schematic diagrams of the hypergraph representation of HyFoG (left) and its preference hypergraph (right). The hyper-preference centrality is calculated using in-degree centrality.
  • Figure 4: (a) Schematic diagram of the cooperative drone pursuit environment with 3 pursuers and 2 evaders. (b) Snapshots of the real-world experiment from the top view. In timestep (ii) and timestep (iii), the pursuers successfully capture the two evaders, respectively.
  • Figure 5: Comparison of Task Success Rate (first column, higher is better), Collision Rate (second column, lower is better), and Mean Episode Length (third column, lower is better) among four baseline methods, one ablation method $\text{HOLA-Drone}_R$, and our proposed HOLA-Drone in the 3-Pursuer-2-Evader Scenario when playing with both Homogeneous Teammates (HoT) and Random Heterogeneous Teammates (HeT). The first row depicts the performances with two homogeneous teammates. The results for the SP method (slashed bar) are obtained from co-playing with the same algorithms and should be excluded from the comparison. The second row shows the results obtained from co-playing with random teammates sampled from an unseen teammate pool. The means and standard deviations, indicated by the error bars, are calculated over three different random seeds, with each seed undergoing 50 repeated runs.
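
The aggregation described in the Figure 5 caption (means and standard deviations over three random seeds, each with 50 repeated runs) might be computed roughly as in the sketch below. The function name `aggregate_success_rate`, the per-run boolean outcome format, and the use of the population standard deviation across seeds are all assumptions made for illustration; the caption does not specify these details, and the numbers are made-up placeholders rather than results from the paper.

```python
from statistics import mean, pstdev


def aggregate_success_rate(runs_by_seed):
    """Return (mean, std) of the per-seed success rate.

    runs_by_seed: one list per random seed, each holding a boolean
    outcome (True = task success) for every repeated run.
    Assumption: the error bar is the population standard deviation
    of the per-seed rates; the paper may use a different estimator.
    """
    per_seed_rates = [sum(runs) / len(runs) for runs in runs_by_seed]
    return mean(per_seed_rates), pstdev(per_seed_rates)


# Hypothetical outcomes: 3 seeds x 50 runs (placeholder data only).
example_runs = [
    [True] * 40 + [False] * 10,
    [True] * 45 + [False] * 5,
    [True] * 35 + [False] * 15,
]
avg_rate, std_rate = aggregate_success_rate(example_runs)
print(f"success rate: {avg_rate:.3f} +/- {std_rate:.3f}")
```

The same pattern would apply to Collision Rate and Mean Episode Length by swapping the per-run quantity that is averaged within each seed.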

Theorems & Definitions (2)

  • Definition 4.1: Hypergraphic-Form Game
  • Definition 4.2: Preference Hypergraph
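
Figure 3 notes that hyper-preference centrality is computed as in-degree centrality on the preference hypergraph. The sketch below illustrates one way this could work, under assumptions that are not confirmed by this page: each hyperedge is a team of strategies with a scalar payoff, each strategy "prefers" the single highest-payoff team it belongs to, and every teammate in that preferred team receives one unit of in-degree. The function name `preference_in_degree`, the strategy labels, and the payoff values are hypothetical.

```python
from collections import Counter


def preference_in_degree(payoffs):
    """Count how often each strategy appears in another strategy's
    preferred team.

    payoffs: dict mapping frozenset of strategy names -> team payoff.
    Assumption: a strategy's outgoing preference hyperedge points to
    the highest-payoff team containing it (ties broken by team name).
    """
    nodes = sorted(set().union(*payoffs))
    in_degree = Counter({node: 0 for node in nodes})
    for node in nodes:
        # All teams this strategy takes part in, with their payoffs.
        candidates = [
            (payoff, tuple(sorted(team)))
            for team, payoff in payoffs.items()
            if node in team
        ]
        _, best_team = max(candidates)
        # Every teammate in the preferred team gains one in-degree.
        for teammate in best_team:
            if teammate != node:
                in_degree[teammate] += 1
    return dict(in_degree)


# Hypothetical payoffs for teams of three strategies.
example_payoffs = {
    frozenset({"a", "b", "c"}): 0.9,
    frozenset({"a", "b", "d"}): 0.6,
    frozenset({"a", "c", "d"}): 0.4,
    frozenset({"b", "c", "d"}): 0.7,
}
print(preference_in_degree(example_payoffs))
```

In this toy example, strategies that many others prefer to team up with end up with a high count, while a strategy that nobody prefers gets zero; how such scores feed into the training objective is defined in the paper itself, not here.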