Deadlock-Free Hybrid RL-MAPF Framework for Zero-Shot Multi-Robot Navigation

Haoyi Wang, Licheng Luo, Yiannis Kantaros, Bruno Sinopoli, Mingyu Cai

TL;DR

The paper tackles deadlocks in multi-robot navigation by combining decentralized RL-based navigation with on-demand, locally confined MAPF to resolve topological bottlenecks. A deadlock detector triggers short, cropped MAPF subproblems for only the implicated agents, solved via Push-and-Rotate and executed as dense waypoint sequences while all other agents continue with RL. This on-demand coordination preserves decentralized efficiency and scales to unseen environments, with formal guarantees and polynomial-time complexity for the local MAPF solver. Empirical results in doorway and corridor scenarios show dramatic improvements in task success, demonstrating robust zero-shot performance. The framework offers a practical, scalable approach to integrating learning-based control with classical planning for reliable multi-robot coordination.

Abstract

Multi-robot navigation in cluttered environments presents fundamental challenges in balancing reactive collision avoidance with long-range goal achievement. When agents navigate through narrow passages or confined spaces, deadlocks frequently emerge that prevent them from reaching their destinations, particularly when Reinforcement Learning (RL) control policies encounter novel configurations outside the training distribution. Existing RL-based approaches generalize poorly to unseen environments. We propose a hybrid framework that integrates RL-based reactive navigation with on-demand Multi-Agent Path Finding (MAPF) to explicitly resolve topological deadlocks. Our approach includes a safety layer that monitors agent progress and, when a deadlock is detected, triggers a coordination controller for the affected agents. The framework constructs globally feasible trajectories via MAPF and regulates waypoint progression to reduce inter-agent conflicts during navigation. Extensive evaluation on dense multi-agent benchmarks shows that our method boosts task completion from marginal to near-universal success, markedly reducing deadlocks and collisions. When integrated with hierarchical task planning, it enables coordinated navigation for heterogeneous robots, demonstrating that coupling reactive RL navigation with selective MAPF intervention yields robust zero-shot performance.
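The abstract describes a safety layer that monitors agent progress to detect deadlocks but does not give the detection rule. The following Python sketch shows one plausible form of such a monitor, assuming a sliding window over distance-to-goal. The class name `ProgressMonitor` and the parameters `window`, `min_progress`, and `goal_tol` are hypothetical and are not taken from the paper.

```python
import math
from collections import deque


class ProgressMonitor:
    """Hypothetical progress-based stall detector (not the paper's exact rule).

    An agent is flagged as stalled when its distance to goal has improved by
    less than `min_progress` over the last `window` updates, unless it is
    already within `goal_tol` of its goal.
    """

    def __init__(self, window=20, min_progress=0.05, goal_tol=0.1):
        self.window = window
        self.min_progress = min_progress
        self.goal_tol = goal_tol
        self.history = {}  # agent id -> recent distances to goal

    def update(self, agent_id, position, goal):
        """Record the latest position and return True if the agent looks stalled."""
        dist = math.dist(position, goal)
        buf = self.history.setdefault(agent_id, deque(maxlen=self.window))
        buf.append(dist)
        if dist <= self.goal_tol:
            return False  # resting at the goal is not a deadlock
        if len(buf) < self.window:
            return False  # not enough history yet
        # Progress is the reduction in distance across the window.
        return (buf[0] - buf[-1]) < self.min_progress
```

In a hybrid loop of the kind the abstract outlines, agents for which `update` returns `True` would be the candidates handed to the coordination controller, while all other agents keep following their RL policy.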

Paper Structure

This paper contains 24 sections, 5 theorems, 7 equations, 3 figures, 2 tables, 1 algorithm.

Key Result

theorem 1

Assume that every connected component $C$ of $\mathcal{G}'$ contains at least two blanks ($b(C)\ge 2$) and that the induced instance $\mathcal{I}'$ is solvable. Then PnR returns $\Pi$; moreover, after tracking $\{\mathcal{W}^{\mathrm{dense}}_a\}$, every $a\in\mathcal{A}_{\mathrm{lc}}$ increases its
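The precondition $b(C)\ge 2$ in the theorem above can be checked directly on a cropped grid. The sketch below is a minimal illustration under three assumptions that are not stated in this excerpt: $\mathcal{G}'$ is a 4-connected grid given as a set of free cells, a blank is a free cell not occupied by any agent, and the condition is applied literally to every component, including components that contain no agents. The function names are hypothetical.

```python
from collections import deque


def blanks_per_component(free_cells, agent_cells):
    """Return the number of blank cells in each 4-connected component.

    free_cells: iterable of (row, col) traversable cells of the local graph.
    agent_cells: iterable of (row, col) cells currently occupied by agents.
    """
    free = set(free_cells)
    occupied = set(agent_cells)
    seen = set()
    counts = []
    for start in free:
        if start in seen:
            continue
        # Breadth-first search over one connected component.
        seen.add(start)
        queue = deque([start])
        blanks = 0
        while queue:
            r, c = queue.popleft()
            if (r, c) not in occupied:
                blanks += 1
            for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nxt in free and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        counts.append(blanks)
    return counts


def satisfies_two_blank_condition(free_cells, agent_cells):
    """True if every component has at least two blanks, i.e. b(C) >= 2."""
    return all(b >= 2 for b in blanks_per_component(free_cells, agent_cells))
```

For example, a 1x4 corridor holding two agents has two blanks and passes the check, whereas the same corridor with three agents has one blank and fails it. Passing the check does not by itself imply solvability, which the theorem lists as a separate assumption.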

Figures (3)

  • Figure 1: Deadlock phenomenon in practice and simulation. Reciprocal avoidance can saturate at bottlenecks, leading to deadlocks (photo adapted from bui2023surveyMRMP).
  • Figure 2: Hybrid RL+MAPF workflow.
  • Figure 3: Qualitative comparison in Doorway (left two panels) and Corridor (right two panels).

Theorems & Definitions (10)

  • theorem 1: Finite-step deadlock clearance
  • proof
  • theorem 2: Polynomially bounded coordination overhead
  • proof
  • lemma 1: Local progress
  • proof
  • theorem 3: Completeness with two blanks per component
  • proof
  • theorem 4: Polynomial running time
  • proof