Optimized Directed Roadmap Graph for Multi-Agent Path Finding Using Stochastic Gradient Descent

Christian Henkel, Marc Toussaint

TL;DR

This work introduces the Optimized Directed Roadmap Graph (ODRM), a directed roadmap learned from C-space samples to minimize expected directed-path costs for multi-agent path finding. By optimizing vertex positions and edge directions with stochastic gradient descent (via ADAM) and using a relaxed directional cost D(d) with a sigmoid penalty, ODRM encodes environmental structure that reduces collisions and deadlocks. The approach yields emergent traffic-like patterns (e.g., edges parallel to walls, circular lanes) and is effective with both centralized and decentralized planners, outperforming undirected roadmaps and grid-based baselines in many scenarios. The results suggest ODRM offers a scalable, environment-aware precomputation that accelerates online planning for large fleets of agents in industrial settings and similar domains.

Abstract

We present a novel approach called Optimized Directed Roadmap Graph (ODRM). It is a method to build a directed roadmap graph that allows for collision avoidance in multi-robot navigation. This is a highly relevant problem, for example for industrial autonomous guided vehicles. The core idea of ODRM is that a directed roadmap can encode inherent properties of the environment that are useful when agents have to avoid each other in that same environment. Like Probabilistic Roadmaps (PRMs), ODRM starts by generating samples from C-space. In a second step, ODRM optimizes vertex positions and edge directions by Stochastic Gradient Descent (SGD). This leads to emergent properties such as edges parallel to walls and patterns similar to two-lane streets or roundabouts. Agents can then navigate on this graph by searching for their paths independently and resolving agent-agent collisions at run-time. On graphs generated by ODRM, significantly fewer agent-agent collisions occur than on a non-optimized graph. We evaluate our roadmap with both centralized and decentralized planners. Our experiments show that with ODRM even a simple centralized planner can solve problems with high numbers of agents that other multi-agent planners cannot solve. Additionally, we use simulated robots with decentralized planners and online collision avoidance to show that agents are considerably faster on our roadmap than on standard grid maps.
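
To make the idea of learning edge directions by gradient descent more concrete, the following is a minimal sketch and not the paper's implementation. It assumes each edge carries a real-valued direction parameter `d` (positive favors the forward direction, negative the reverse, near zero is undecided) and that traversing an edge against its preferred direction is penalized through a sigmoid. The function names, the `penalty` weight, the learning rate, and the exact form of the cost are illustrative assumptions. The update is a plain gradient step, whereas the paper uses Adam.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def edge_cost(length: float, d: float, forward: bool, penalty: float = 10.0) -> float:
    """Relaxed cost of traversing one edge (assumed form, not the paper's D(d)).

    The cost equals roughly `length` when the traversal agrees with the
    sign of `d`, and grows towards `length * (1 + penalty)` when it
    goes against it.
    """
    signed = d if forward else -d
    return length * (1.0 + penalty * sigmoid(-signed))


def edge_cost_grad(length: float, d: float, forward: bool, penalty: float = 10.0) -> float:
    """Analytic derivative of `edge_cost` with respect to `d`."""
    signed = d if forward else -d
    s = sigmoid(-signed)
    sign = 1.0 if forward else -1.0
    # d/d(signed) sigmoid(-signed) = -s * (1 - s); chain rule adds `sign`.
    return -sign * length * penalty * s * (1.0 - s)


# Start undecided and repeatedly observe a path that uses the edge forward.
d = 0.0
learning_rate = 0.1  # hypothetical value
for _ in range(100):
    d -= learning_rate * edge_cost_grad(1.0, d, forward=True)

print(f"learned d = {d:.3f}")
print(f"forward cost  = {edge_cost(1.0, d, True):.3f}")
print(f"backward cost = {edge_cost(1.0, d, False):.3f}")
```

In this toy setting the parameter drifts away from zero in the direction that the sampled paths use, which is the intuition behind edges becoming "decided" during training. In the full method, many random paths per batch contribute to the gradient, and vertex positions are optimized alongside the directions.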

Paper Structure

This paper contains 25 sections, 4 equations, 8 figures.

Figures (8)

  • Figure 1: Optimization progress of the roadmap. Edges are shown in their current most likely direction. Red edges indicate a $d$ (see \ref{eq:direction_cost}) close to $0$ (i.e. an undecided edge), while green edges have a high confidence with a $d$ further away from $0$. The black line is one random path from the evaluation set. This uses Scenario O (\ref{fig:scenario_o}).
  • Figure 2: Evaluation Scenario Maps. White areas indicate $C_{free}$, black areas are obstacles, and gray areas are unknown.
  • Figure 3: The convergence of the Batch Cost Function over the training progress. The Batch Cost Function is the sum of all path costs (\ref{eq:cost}) within one batch.
  • Figure 4: Comparison of success rate, path duration and computation time over the number of agents for different combinations of planners and graphs described in \ref{ssec:eval_cen}. Data is an average over 20 runs in Scenario Z (\ref{fig:scenario_z}) with 200 vertices.
  • Figure 5: Roadmaps in Scenarios O and X at the end of convergence after $2048$ batches of size $\alpha_B = 256$. The black line indicates a randomly selected path through the roadmap. Red edges indicate a $d$ close to $0$, i.e. an undecided edge, while green edges have a high confidence.
  • ...and 3 more figures