POGEMA: A Benchmark Platform for Cooperative Multi-Agent Pathfinding

Alexey Skrynnik, Anton Andreychuk, Anatolii Borzilov, Alexander Chernyavskiy, Konstantin Yakovlev, Aleksandr Panov

TL;DR

POGEMA tackles the lack of a unified benchmark for cooperative MAPF by delivering a fast, Python-based environment, a problem-instance generator, a visualization toolkit, and a benchmarking suite with a domain-specific evaluation protocol. It enables fair comparisons across pure MARL, hybrid, and planning-based methods and provides a diverse set of baselines including centralized planners like LaCAM and RHCR, as well as state-of-the-art hybrids SCRIMP and DCC. Across MAPF and Lifelong MAPF, centralized planners frequently lead on key metrics, with hybrids maintaining a strong edge over pure MARL in many scenarios; MARL approaches can match or exceed some baselines under Lifelong MAPF, illustrating the value of shared information and planning components. The platform’s procedural map generation, detailed metrics, and distributed evaluation capabilities offer a practical means to study generalization, scalability, and coordination, with potential impact on real-world robotics and warehouse automation research.

Abstract

Multi-agent reinforcement learning (MARL) has recently excelled in solving challenging cooperative and competitive multi-agent problems in various environments, typically involving a small number of agents and full observability. Moreover, a range of crucial robotics-related tasks, such as multi-robot pathfinding, which have traditionally been approached with classical non-learnable methods (e.g., heuristic search), are now also being addressed with learning-based or hybrid methods. However, in this domain, it remains difficult, if not impossible, to conduct a fair comparison between classical, learning-based, and hybrid approaches due to the lack of a unified framework that supports both learning and evaluation. To address this, we introduce POGEMA, a comprehensive set of tools that includes a fast environment for learning, a problem instance generator, a collection of predefined problem instances, a visualization toolkit, and a benchmarking tool for automated evaluation. We also define an evaluation protocol that specifies a range of domain-related metrics, computed from primary evaluation indicators (such as success rate and path length), enabling a fair multi-fold comparison. We present the results of this comparison, which involves a variety of state-of-the-art MARL, search-based, and hybrid methods.
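
To make concrete the idea of deriving metrics from primary indicators such as success and path length, here is a minimal sketch in plain Python. It is not POGEMA's implementation and does not use the POGEMA API. It assumes CSR is the fraction of episodes in which every agent reaches its goal and SoC is the per-episode sum of the agents' step counts. The names `summarize` and `mean_with_ci95`, the episode data, and the normal-approximation confidence interval are all hypothetical choices made for illustration.

```python
import math
from statistics import mean, stdev


def mean_with_ci95(values):
    """Return (mean, half_width) using a normal-approximation 95% interval.

    Assumption: the interval method is illustrative only; the benchmark
    may compute its confidence intervals differently.
    """
    m = mean(values)
    if len(values) < 2:
        return m, 0.0
    half_width = 1.96 * stdev(values) / math.sqrt(len(values))
    return m, half_width


def summarize(episodes):
    """Aggregate per-agent results into CSR and SoC.

    episodes: list of episodes; each episode is a list of
    (reached_goal, steps_taken) tuples, one per agent (hypothetical format).
    """
    # CSR flag is 1.0 only when every agent in the episode reached its goal.
    csr_flags = [1.0 if all(reached for reached, _ in ep) else 0.0
                 for ep in episodes]
    # SoC here is the sum of step counts over all agents in the episode.
    soc_values = [float(sum(steps for _, steps in ep)) for ep in episodes]
    return {"CSR": mean_with_ci95(csr_flags), "SoC": mean_with_ci95(soc_values)}


# Made-up episode data: two agents per episode, four episodes.
episodes = [
    [(True, 12), (True, 9)],
    [(True, 15), (False, 64)],
    [(True, 7), (True, 11)],
    [(True, 10), (True, 10)],
]
result = summarize(episodes)
print(result)
```

Under these assumptions, one failed agent makes the whole episode count as a failure for CSR, which is why cooperative metrics can diverge sharply from per-agent ones. Whether failed episodes should contribute to SoC is a protocol decision that this sketch does not settle.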

Paper Structure (48 sections, 6 equations, 18 figures, 13 tables)

Figures (18)

  • Figure 1: (a) Example of the multi-agent pathfinding problem considered in POGEMA: each agent must reach its goal, denoted by a flag of the same color. (b) Observation tensor of the red agent. (c) Evaluation results of MARL, hybrid, and search-based solvers on the POGEMA benchmark.
  • Figure 2: Examples of map generators presented in POGEMA.
  • Figure 3: Evaluation of baselines available in POGEMA on (a) MAPF and (b) LMAPF instances.
  • Figure 4: Performance of MAPF approaches on Random and Mazes maps, based on CSR (higher is better) and SoC (lower is better) metrics. The shaded area indicates $95\%$ confidence intervals.
  • Figure 5: Performance of MAPF approaches on Cities-tiles maps. These results were used to compute the Out-of-Distribution metric. The shaded area indicates $95\%$ confidence intervals.
  • ...and 13 more figures