Online Guidance Graph Optimization for Lifelong Multi-Agent Path Finding

Hongzhi Zang; Yulun Zhang; He Jiang; Zhe Chen; Daniel Harabor; Peter J. Stuckey; Jiaoyang Li

TL;DR

This work tackles lifelong MAPF by learning an online guidance policy that updates a guidance graph to adapt edge costs in real time, aiming to improve throughput for PIBT-based LMAPF. It introduces two integration pipelines, Direct Planning and Guide-Path Planning, in which a policy parameterized by $\boldsymbol{\theta}$ maps observations to the edge weights $\boldsymbol{\omega}$ of the guidance graph $G_g(V_g, E_g, \boldsymbol{\omega})$; $\boldsymbol{\theta}$ is optimized with CMA-ES using LMAPF simulators. Empirical results across multiple maps and dynamic task distributions show that online guidance outperforms offline guidance and human-designed online policies, with throughput improvements of up to 30.75% over offline guidance and up to 52.42% over handcrafted policies in certain settings; LNS further enhances performance at modest runtime costs. The findings highlight the practical potential of dynamic guidance in large-scale robotic systems and suggest directions for broader application and efficiency improvements in online policy optimization.
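To make the idea of a policy that maps observations to edge weights more concrete, the following is a minimal sketch and not the authors' method. It assumes a hypothetical two-parameter linear policy (`theta = (base, slope)`), a hypothetical observation consisting of per-edge traffic counts, and plain Dijkstra search as a stand-in for the planner (the paper uses PIBT, and optimizes the policy with CMA-ES, neither of which is reproduced here). It only illustrates how traffic-dependent weights on a guidance graph can steer a route away from a congested corridor.

```python
import heapq


def policy_edge_weights(theta, traffic):
    """Hypothetical linear policy: weight = base + slope * traffic, floored at 0.1."""
    base, slope = theta
    return {edge: max(0.1, base + slope * load) for edge, load in traffic.items()}


def guided_path(weights, source, target):
    """Dijkstra over a directed graph given as {(u, v): weight}; returns (cost, path)."""
    adjacency = {}
    for (u, v), w in weights.items():
        adjacency.setdefault(u, []).append((v, w))

    dist = {source: 0.0}
    parent = {}
    frontier = [(0.0, source)]
    while frontier:
        d, u = heapq.heappop(frontier)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        if u == target:
            break
        for v, w in adjacency.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(frontier, (nd, v))

    if target not in dist:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return dist[target], path[::-1]


# Two corridors from A to D; the corridor through B is currently congested.
traffic = {("A", "B"): 4, ("B", "D"): 4, ("A", "C"): 0, ("C", "D"): 1}
theta = (1.0, 0.5)  # assumed policy parameters, chosen by hand for illustration
weights = policy_edge_weights(theta, traffic)
cost, path = guided_path(weights, "A", "D")
```

With these assumed values the congested edges receive weight 3.0 each, so the search prefers the route through C. In an online setting the weights would be recomputed as the traffic observation changes.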

Abstract

We study the problem of optimizing a guidance policy capable of dynamically guiding the agents for lifelong Multi-Agent Path Finding based on real-time traffic patterns. Multi-Agent Path Finding (MAPF) focuses on moving multiple agents from their starts to goals without collisions. Its lifelong variant, LMAPF, continuously assigns new goals to agents. In this work, we focus on improving the solution quality of PIBT, a state-of-the-art rule-based LMAPF algorithm, by optimizing a policy to generate adaptive guidance. We design two pipelines that incorporate guidance into PIBT in different ways. We demonstrate the superiority of the optimized policy over both static guidance and human-designed policies. Additionally, we explore scenarios where the task distribution changes over time, a challenging yet common situation in real-world applications that is rarely explored in the literature.
