Symmetry-Breaking in Multi-Agent Navigation: Winding Number-Aware MPC with a Learned Topological Strategy

Tomoki Nakao, Kazumi Kasaura, Tadashi Kozuno

TL;DR

This paper tackles symmetry-induced deadlocks in decentralized multi-agent navigation by introducing a winding-number-aware hierarchical MPC (WNumMPC). A learning-based Planner outputs target winding numbers $w_k^{i,j}$ and dynamic interaction weights $\alpha_{w,k}^{i,j}$ to steer a model-based MPC Controller that generates collision-free motions, combining flexible planning with reliable execution. The Planner is trained with PPO to learn topology-driven strategies, while the Controller optimizes a cost $\mathcal{J} = \alpha_g \mathcal{J}_g + \alpha_o \mathcal{J}_o + \mathcal{J}_w$, where $\mathcal{J}_w$ enforces the target winding numbers, enabling effective symmetry breaking in dense environments. In both simulation and real-world tabletop experiments, WNumMPC outperforms baselines in success rate and efficiency, demonstrating the practical viability of learned topological strategies for decentralized navigation. Code is available at https://github.com/omron-sinicx/WNumMPC.
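To make the structure of the Controller cost concrete, the following is a minimal sketch of how the three terms could be combined for one agent. This summary does not give the form of $\mathcal{J}_w$ or say what the subscripts $g$ and $o$ stand for, so the sketch assumes a goal term, an obstacle (collision-avoidance) term, and a weighted quadratic penalty on the gap between achieved and target winding numbers. The function name `total_cost` and all of its parameters are hypothetical and are not taken from the WNumMPC code.

```python
def total_cost(j_g, j_o, winding, target_winding, alpha_w, alpha_g=1.0, alpha_o=1.0):
    """Combine the cost terms as J = alpha_g * J_g + alpha_o * J_o + J_w.

    j_g, j_o       : precomputed scalar costs (ASSUMED: goal and obstacle terms).
    winding        : dict mapping an agent pair (i, j) to its achieved winding number.
    target_winding : dict mapping (i, j) to the target value from the Planner.
    alpha_w        : dict mapping (i, j) to the Planner's interaction weight.
    """
    # ASSUMPTION: J_w is a weighted quadratic penalty; the paper may define it differently.
    j_w = sum(
        alpha_w[pair] * (winding[pair] - target_winding[pair]) ** 2
        for pair in target_winding
    )
    return alpha_g * j_g + alpha_o * j_o + j_w


# One interaction pair: the candidate trajectory achieves 0.0 turns, the target is 0.5.
pair = (0, 1)
cost = total_cost(2.0, 1.0, {pair: 0.0}, {pair: 0.5}, {pair: 4.0})
print(cost)  # 2.0 + 1.0 + 4.0 * 0.25 = 4.0
```

In this sketch a larger `alpha_w` for a pair makes deviations from that pair's target winding number more expensive, which is one way the dynamic weights could prioritize a particular interaction.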

Abstract

We address the fundamental challenge of resolving symmetry-induced deadlocks in distributed multi-agent navigation by proposing a new hierarchical navigation method. When multiple agents interact, it is inherently difficult for them to autonomously break the symmetry of deciding how to pass each other. To tackle this problem, we introduce an approach that quantifies cooperative symmetry-breaking strategies using a topological invariant called the winding number, and learns the strategies themselves through reinforcement learning. Our method features a hierarchical policy consisting of a learning-based Planner, which plans topological cooperative strategies, and a model-based Controller, which executes them. Through reinforcement learning, the Planner learns to produce two types of parameters for the Controller: one is the topological cooperative strategy represented by winding numbers, and the other is a set of dynamic weights that determine which agent interaction to prioritize in dense scenarios where multiple agents cross simultaneously. The Controller then generates collision-free and efficient motions based on the strategy and weights provided by the Planner. This hierarchical structure combines the flexible decision-making ability of learning-based methods with the reliability of model-based approaches. Simulation and real-world robot experiments demonstrate that our method outperforms existing baselines, particularly in dense environments, by efficiently avoiding collisions and deadlocks while achieving superior navigation performance. The code for the experiments is available at https://github.com/omron-sinicx/WNumMPC.

Paper Structure

This paper contains 22 sections, 14 equations, 6 figures, 1 algorithm.

Figures (6)

  • Figure 1: Seven small two-wheeled robots (“maru” ichihashi2024swarm) moving on a tabletop. By cooperatively breaking positional symmetry, the system achieves efficient navigation.
  • Figure 2: The proposed hierarchical architecture for cooperative symmetry breaking. The learning-based Planner generates a high-level topological strategy to break symmetry, which the model-based Controller then executes reliably.
  • Figure 3: Comparison of agent trajectories in a Crossing scenario, generated by (a) ORCA, (b) CADRL, (c) Vanilla MPC, (d) T-MPC, and (e) WNumMPC. The proposed method (e) efficiently resolves the navigational symmetry, unlike baseline methods that result in deadlocks (a), collisions (b), and inefficient paths (c, d).
  • Figure 4: Comparison of Navigation Success Rates (solid bars) and Timeout Rates (hatched bars). (Left) Results in simulations of the holonomic model for each agent count ($N$) and generation method of instances (Random, Crossing). (Right) Results of the MPC-based methods with $N=7$ differential wheeled robots in simulations and real-world experiments.
  • Figure 5: Comparison of Navigation Efficiency Based on Average Extra Time to Goal. (Left) Average extra time in simulations of the holonomic model for each agent count ($N$) and generation method of instances (Random, Crossing). (Right) Average extra time for MPC-based methods with $N=7$ differential wheeled robots in simulations and real-world experiments.
  • ...and 1 more figure

Theorems & Definitions (1)

  • Definition 1: Winding Number
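
The text of Definition 1 is not reproduced in this summary. As a rough illustration only, the sketch below uses the standard notion of a winding number: the signed number of turns that the relative position vector between two agents makes around the origin, computed by summing signed angle increments along sampled trajectories. The paper's exact definition, sign convention, and normalization may differ, and the helper names `relative_path` and `winding_number` are hypothetical.

```python
import math


def relative_path(path_i, path_j):
    """Position of agent j relative to agent i at each sampled time step."""
    return [(xj - xi, yj - yi) for (xi, yi), (xj, yj) in zip(path_i, path_j)]


def winding_number(rel_positions):
    """Signed number of turns of the relative vector around the origin.

    ASSUMPTION: consecutive samples turn by less than half a revolution,
    so each increment can be read off with atan2.
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(rel_positions, rel_positions[1:]):
        cross = x0 * y1 - y0 * x1  # proportional to sin of the signed angle
        dot = x0 * x1 + y0 * y1    # proportional to cos of the signed angle
        total += math.atan2(cross, dot)
    return total / (2.0 * math.pi)


# Two agents swap positions head-on. The two mirrored ways of passing
# yield winding numbers of opposite sign.
pass_a = relative_path([(-1, 0), (0, -0.5), (1, 0)], [(1, 0), (0, 0.5), (-1, 0)])
pass_b = relative_path([(-1, 0), (0, 0.5), (1, 0)], [(1, 0), (0, -0.5), (-1, 0)])
print(winding_number(pass_a))  # approximately  0.5
print(winding_number(pass_b))  # approximately -0.5
```

The two passing choices are mirror images of each other, and the sign of the winding number tells them apart. This is what makes the quantity usable as a target for agreeing on a passing side.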