Cooperative Hybrid Multi-Agent Pathfinding Based on Shared Exploration Maps

Ning Liu, Sen Shen, Xiangrui Kong, Hongtao Zhang, Thomas Bräunl

TL;DR

This work tackles cooperative multi-agent pathfinding (MAPF) in incomplete and dynamic environments by combining D* Lite global search with multi-agent reinforcement learning in a hybrid framework, CHS. A switching mechanism and an anti-freezing strategy balance global path optimality against local adaptability, while a shared incremental exploration map and per-agent grid memory support scalable coordination under partial observability with reduced communication overhead. Results in POGEMA-like simulations and on the EyeSim platform show CHS achieving higher success rates, better collision avoidance, and better path efficiency, especially in large-scale, congested scenarios. The approach targets real-time multi-robot systems in evolving environments, maintaining robust performance without full global information exchange.

Abstract

Multi-Agent Pathfinding (MAPF) is used in areas including multi-robot formations, warehouse logistics, and intelligent vehicles. However, many environments are incomplete or change frequently, making it difficult for standard centralized planning or pure reinforcement learning to maintain both global solution quality and local flexibility. This paper introduces a hybrid framework that integrates D* Lite global search with multi-agent reinforcement learning, using a switching mechanism and a freeze-prevention strategy to handle dynamic conditions and crowded settings. We evaluate the framework in the discrete POGEMA environment and compare it with baseline methods. Experimental outcomes indicate that the proposed framework substantially improves success rate, collision rate, and path efficiency. The model is further tested on the EyeSim platform, where it maintains feasible pathfinding under frequent changes and large-scale robot deployments.

Paper Structure

This paper contains 16 sections, 6 equations, 7 figures, 2 tables, 2 algorithms.

Figures (7)

  • Figure 1: Framework of CHS. The system detects loops to trigger reinforcement learning actions, or uses a switching mechanism to select between D* Lite and EPOM based on agent density. The EPOM module processes observations through a ResNet encoder and GRU networks. This architecture enables dynamic switching between global and local planning while maintaining path consistency.
  • Figure 2: The shared map provides global information that is updated in small increments from local observations.
  • Figure 3: Results on a $20\times20$ grid map for exploration length and success rate.
  • Figure 4: Shared Exploration Maps Ablation
  • Figure 5: Loop Detection Ablation
  • ...and 2 more figures
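
The control flow described in the captions of Figures 1 and 2 can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The function names, the loop test (a full position window covering very few distinct cells), the density threshold, and the cell-state encoding are all hypothetical. The direction of the density rule (crowded neighbourhoods use EPOM, sparse ones use D* Lite) is also an assumption, since the caption only says the choice depends on agent density.

```python
from collections import deque
from typing import Deque, Dict, Tuple

Cell = Tuple[int, int]

# Hypothetical cell-state encoding for the shared exploration map.
UNKNOWN, FREE, OBSTACLE = -1, 0, 1


def merge_observation(shared_map: Dict[Cell, int], local_obs: Dict[Cell, int]) -> int:
    """Write newly observed cells into the shared map in place.

    Returns the number of cells whose state changed, so a caller could
    trigger an incremental replan only when something actually changed.
    """
    changed = 0
    for cell, state in local_obs.items():
        if state != UNKNOWN and shared_map.get(cell, UNKNOWN) != state:
            shared_map[cell] = state
            changed += 1
    return changed


def in_loop(history: Deque[Cell], max_distinct: int = 2) -> bool:
    """Assumed loop test: the window is full and covers very few distinct cells."""
    return len(history) == history.maxlen and len(set(history)) <= max_distinct


def choose_planner(history: Deque[Cell], nearby_agents: int,
                   density_threshold: int = 3) -> str:
    """Pick the module that produces the agent's next action."""
    if in_loop(history):
        return "rl_action"      # loop detected: fall back to a learned action
    if nearby_agents >= density_threshold:
        return "epom"           # assumed: crowded area, use the learned local policy
    return "dstar_lite"         # assumed: sparse area, follow the global search path


if __name__ == "__main__":
    shared: Dict[Cell, int] = {(0, 0): FREE}
    merge_observation(shared, {(0, 1): OBSTACLE, (0, 2): UNKNOWN})
    oscillating = deque([(0, 0), (1, 0), (0, 0), (1, 0)], maxlen=4)
    print(choose_planner(oscillating, nearby_agents=0))
```

Keeping the loop check ahead of the density check mirrors the caption's ordering, where loop detection is described first and the switching mechanism as the alternative branch.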