GRAND: Guidance, Rebalancing, and Assignment for Networked Dispatch in Multi-Agent Path Finding

Johannes Gaber, Meshal Alharbi, Daniele Gammelli, Gioele Zardini

TL;DR

This paper tackles high-throughput, real-time lifelong MAPD scheduling for large robot fleets by marrying graph-based global guidance with lightweight optimization. A learned guidance policy produces region-level agent-distribution targets, which are efficiently realized through a region-to-region minimum-cost flow rebalancing followed by decoupled local task assignments. The method yields up to 10% throughput improvements on standardized warehouse benchmarks while maintaining a 1-second per-step compute budget and demonstrates zero-shot transfer across map sizes and occupancy. The work highlights the practicality of graph neural guidance combined with tractable solvers for scalable, high-throughput fleet management in industrial settings.

Abstract

Large robot fleets are now common in warehouses and other logistics settings, where small control gains translate into large operational impacts. In this article, we address task scheduling for lifelong Multi-Agent Pickup-and-Delivery (MAPD) and propose a hybrid method that couples learning-based global guidance with lightweight optimization. A graph neural network policy trained via reinforcement learning outputs a desired distribution of free agents over an aggregated warehouse graph. This signal is converted into region-to-region rebalancing through a minimum-cost flow, and finalized by small, local assignment problems, preserving accuracy while keeping per-step latency within a 1 s compute budget. On congested warehouse benchmarks from the League of Robot Runners (LRR) with up to 500 agents, our approach improves throughput by up to 10% over the 2024 winning scheduler while maintaining real-time execution. The results indicate that coupling graph-structured learned guidance with tractable solvers reduces congestion and yields a practical, scalable blueprint for high-throughput scheduling in large fleets.
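To make the middle stage concrete, the following is a minimal sketch of region-to-region rebalancing posed as a minimum-cost flow. It is an illustration under stated assumptions, not the authors' implementation: the function names `min_cost_flow` and `rebalance`, the three-region instance, and the travel-cost matrix are all hypothetical, and the desired distribution is hard-coded here, whereas in the paper it comes from the learned guidance policy. The solver is a plain successive-shortest-paths routine chosen for self-containment; the abstract does not say which solver the paper uses.

```python
from collections import deque


def min_cost_flow(n, edges, source, sink, required):
    """Successive shortest paths with a queue-based Bellman-Ford search.

    edges: list of (u, v, capacity, cost). Returns (total_cost, flow_per_edge).
    """
    graph = [[] for _ in range(n)]  # node -> indices into the edge arrays
    to, cap, cost = [], [], []
    for u, v, c, w in edges:
        # Forward edge at an even index, its residual twin at the next odd one.
        graph[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        graph[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)
    sent, total = 0, 0
    while sent < required:
        dist = [float("inf")] * n
        prev_edge = [-1] * n
        in_queue = [False] * n
        dist[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for e in graph[u]:
                v = to[e]
                if cap[e] > 0 and dist[u] + cost[e] < dist[v]:
                    dist[v] = dist[u] + cost[e]
                    prev_edge[v] = e
                    if not in_queue[v]:
                        in_queue[v] = True
                        queue.append(v)
        if prev_edge[sink] == -1:
            raise ValueError("requested flow cannot be routed")
        # Bottleneck capacity along the cheapest augmenting path.
        push, v = required - sent, sink
        while v != source:
            e = prev_edge[v]
            push = min(push, cap[e])
            v = to[e ^ 1]  # tail of edge e
        v = sink
        while v != source:
            e = prev_edge[v]
            cap[e] -= push
            cap[e ^ 1] += push
            v = to[e ^ 1]
        sent += push
        total += push * dist[sink]
    # Flow on original edge i equals the capacity accumulated on its twin.
    return total, [cap[2 * i + 1] for i in range(len(edges))]


def rebalance(current, desired, travel_cost):
    """Move free agents between regions so counts match `desired` at minimum cost.

    current[r], desired[r]: free-agent counts per region (assumed equal sums).
    travel_cost[r][s]: assumed cost of sending one agent from region r to s.
    """
    k = len(current)
    if sum(current) != sum(desired):
        raise ValueError("current and desired totals must match")
    source, sink = 2 * k, 2 * k + 1
    edges = []
    for r in range(k):
        edges.append((source, r, current[r], 0))    # supply of region r
        edges.append((k + r, sink, desired[r], 0))  # demand of region r
    pairs = [(r, s) for r in range(k) for s in range(k)]
    for r, s in pairs:
        edges.append((r, k + s, current[r], travel_cost[r][s]))
    total, flows = min_cost_flow(2 * k + 2, edges, source, sink, sum(current))
    moves = {p: f for p, f in zip(pairs, flows[2 * k:]) if f > 0 and p[0] != p[1]}
    return total, moves


# Hypothetical instance: three regions on a line, unit cost per hop.
current = [4, 1, 0]   # free agents now
desired = [1, 2, 2]   # target distribution (stand-in for the policy output)
travel_cost = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
total, moves = rebalance(current, desired, travel_cost)
print(total, moves)  # total cost is 5; equally cheap move plans may differ
```

The returned `moves` dictionary maps `(from_region, to_region)` pairs to agent counts. In the pipeline described above, each region would then solve its own small assignment problem to match the agents it holds to concrete tasks, which this sketch leaves out.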

Paper Structure

This paper contains 27 sections, 19 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Overview of our hierarchical task scheduling approach: (I) a data-driven layer provides global, macroscopic guidance in the form of a desired agent distribution; (II) a region-to-region optimal transport rebalances free agents toward the desired distribution; and (III) local, decoupled matching problems produce the final assignments for task scheduling. All symbols in the figure are defined later in the text.
  • Figure 2: An example of our graph aggregation for warehouse layouts.
  • Figure 3: Throughput for varying numbers of agents $|A|$ and map sizes $|V_\mathrm{tile}|$.
  • Figure 4: Heat map of the total number of conflicts with $|A|=200$ and $|V_\mathrm{tile}|=975$. The darkest color represents zero conflicts, while the brightest color represents 3000 conflicts.
  • Figure 5: Average initial and lifelong per-step computation times.
  • ...and 2 more figures

Theorems & Definitions (6)

  • Definition IV.1: Task generator
  • Definition IV.2: Goal map
  • Definition IV.3: Scheduling policy
  • Definition IV.4: Planning policy
  • Definition IV.5: Throughput
  • Remark V.1: Feasibility and interpretation