Optimized Directed Roadmap Graph for Multi-Agent Path Finding Using Stochastic Gradient Descent
Christian Henkel, Marc Toussaint
TL;DR
This work introduces the Optimized Directed Roadmap Graph (ODRM), a directed roadmap learned from C-space samples to minimize expected directed-path costs for multi-agent path finding. By optimizing vertex positions and edge directions with stochastic gradient descent (via ADAM) and using a relaxed directional cost D(d) with a sigmoid penalty, ODRM encodes environmental structure that reduces collisions and deadlocks. The approach yields emergent traffic-like patterns (e.g., edges parallel to walls, circular lanes) and is effective with both centralized and decentralized planners, outperforming undirected roadmaps and grid-based baselines in many scenarios. The results suggest ODRM offers a scalable, environment-aware precomputation that accelerates online planning for large fleets of agents in industrial settings and similar domains.
Abstract
We present a novel approach called Optimized Directed Roadmap Graph (ODRM), a method for building a directed roadmap graph that supports collision avoidance in multi-robot navigation. This problem is highly relevant, for example, for industrial autonomous guided vehicles. The core idea of ODRM is that a directed roadmap can encode inherent properties of the environment that are useful when agents have to avoid each other in that same environment. Like Probabilistic Roadmaps (PRMs), ODRM first generates samples from C-space. In a second step, ODRM optimizes vertex positions and edge directions by Stochastic Gradient Descent (SGD). This leads to emergent properties such as edges parallel to walls and patterns similar to two-lane streets or roundabouts. Agents can then navigate on this graph by searching for their paths independently and resolving agent-agent collisions as they occur at run-time. On graphs generated by ODRM, significantly fewer agent-agent collisions occur than on a non-optimized graph. We evaluate our roadmap with both centralized and decentralized planners. Our experiments show that with ODRM even a simple centralized planner can solve problems with high numbers of agents that other multi-agent planners cannot solve. Additionally, we use simulated robots with decentralized planners and online collision avoidance to show that agents are considerably faster on our roadmap than on standard grid maps.
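As a rough illustration of two ingredients named above, sampling vertices from free C-space and relaxing a discrete edge direction into a smooth, differentiable quantity, a minimal Python sketch might look as follows. The obstacle test `is_free`, the function names, the constants `penalty` and `sharpness`, and the exact form of the cost are assumptions made for illustration only; they are not taken from the ODRM formulation.

```python
import numpy as np


def sample_free(n, is_free, rng):
    """Rejection-sample n points from the free part of the unit square."""
    points = []
    while len(points) < n:
        p = rng.random(2)          # uniform candidate in [0, 1)^2
        if is_free(p):             # keep only collision-free candidates
            points.append(p)
    return np.array(points)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def directed_edge_cost(p_from, p_to, d, penalty=10.0, sharpness=5.0):
    """Hypothetical relaxed directional cost of traversing p_from -> p_to.

    d is a real-valued direction parameter: d > 0 favours this direction,
    d < 0 favours the opposite one. The sigmoid turns the hard
    "allowed / not allowed" choice into a smooth penalty, so the cost is
    differentiable in d and in the vertex positions.
    """
    length = np.linalg.norm(p_to - p_from)
    return length * (1.0 + penalty * sigmoid(-sharpness * d))


def is_free(p):
    """Illustrative environment: a square obstacle in the middle of the map."""
    return not (0.4 <= p[0] <= 0.6 and 0.4 <= p[1] <= 0.6)


rng = np.random.default_rng(0)
vertices = sample_free(50, is_free, rng)
```

In a sketch like this, an optimizer could adjust both the entries of `vertices` and the per-edge parameter `d`, because the penalty varies smoothly instead of switching abruptly when `d` changes sign.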
