Why Solving Multi-agent Path Finding with Large Language Model has not Succeeded Yet

Weizhe Chen, Sven Koenig, Bistra Dilkina

TL;DR

This work critically assesses the viability of solving multi-agent path finding (MAPF) with large language models (LLMs) without task-specific fine-tuning. It presents a prompt-driven workflow where an LLM provides per-step actions for agents, validated by a rule-based conflict checker, and iterates until a collision-free plan emerges. The experiments show LLMs can succeed on simple, obstacle-free maps but struggle on room- and maze-like benchmarks as map complexity and agent count grow, with failures attributed to three core factors: limited reasoning capability, restricted context length, and difficulties grounding obstacle locations. The study highlights concrete directions for future progress, including longer context support, better grounding of map information, and potential tool-assisted planning to bridge the gap between LLM reasoning and MAPF requirements. The findings offer a structured roadmap for researchers from different backgrounds to advance foundation-model-based MAPF research.
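The rule-based conflict checker in this workflow can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a 4-connected grid, five hypothetical action names (`stay`, `up`, `down`, `left`, `right`), a vertical axis that grows upward, and the standard MAPF notions of vertex conflicts (two agents in the same cell) and swap conflicts (two agents exchanging cells in one step). The function name `check_step` and its parameters are illustrative.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]  # (x, y); assumption: y grows upward

# Hypothetical action vocabulary; the paper's exact action names may differ.
MOVES: Dict[str, Cell] = {
    "stay": (0, 0),
    "up": (0, 1),
    "down": (0, -1),
    "left": (-1, 0),
    "right": (1, 0),
}


def check_step(
    positions: List[Cell],
    actions: List[str],
    obstacles: Set[Cell],
    width: int,
    height: int,
) -> Tuple[List[Cell], List[str]]:
    """Apply one joint action and return (next positions, violation messages)."""
    if len(positions) != len(actions):
        raise ValueError("need exactly one action per agent")

    errors: List[str] = []
    nxt: List[Cell] = []

    # Per-agent checks: known action, inside the map, not into an obstacle.
    for i, (pos, act) in enumerate(zip(positions, actions)):
        if act not in MOVES:
            errors.append(f"agent {i}: unknown action {act!r}")
            nxt.append(pos)  # treat an unknown action as staying in place
            continue
        dx, dy = MOVES[act]
        cell = (pos[0] + dx, pos[1] + dy)
        if not (0 <= cell[0] < width and 0 <= cell[1] < height):
            errors.append(f"agent {i}: moves off the map to {cell}")
        elif cell in obstacles:
            errors.append(f"agent {i}: moves into obstacle at {cell}")
        nxt.append(cell)

    # Pairwise checks: vertex conflicts and swap (edge) conflicts.
    for i in range(len(nxt)):
        for j in range(i + 1, len(nxt)):
            if nxt[i] == nxt[j]:
                errors.append(f"agents {i} and {j}: vertex conflict at {nxt[i]}")
            elif nxt[i] == positions[j] and nxt[j] == positions[i]:
                errors.append(f"agents {i} and {j}: swap conflict")

    return nxt, errors
```

In a loop of the kind described above, an empty error list would let the step be accepted, while a non-empty one could be returned to the LLM as feedback before it proposes the step again. For instance, `check_step([(0, 0), (1, 0)], ["right", "left"], set(), 4, 4)` reports a swap conflict between agents 0 and 1.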

Abstract

Following the success of large language models (LLMs) such as ChatGPT and GPT-4, a large body of recent work has shown that foundation models can be used to solve a wide variety of tasks. However, very little of this work shares insights on multi-agent planning. Multi-agent planning differs from other domains in that it combines the difficulties of multi-agent coordination and planning, and it is hard to leverage external tools to facilitate the reasoning needed. In this paper, we focus on the problem of multi-agent path finding (MAPF), also known as multi-robot route planning, and study the performance of solving MAPF with LLMs. We first show the motivating success on an empty room map without obstacles, then the failure to plan on the harder room map and maze map of the standard MAPF benchmark. We present our position on why directly solving MAPF with LLMs has not been successful yet, and we use various experiments to support our hypothesis. Based on our results, we discuss how researchers with different backgrounds could help with this problem from different perspectives.

Paper Structure (20 sections, 9 figures, 3 tables)

Figures (9)

  • Figure 1: An illustration of our workflow.
  • Figure 2: An example of the user prompt for describing the scenario. Text in blue is a scenario-specific prompt, while text in orange is a map-specific prompt. In the experiments on the empty map, only the first blue paragraph will be provided, and all text starting from the black paragraph is removed because there are no obstacles. The text in purple is the single-step observation (SSO) information.
  • Figure 3: An example of the user prompt starting from the second step. While here we demonstrate a few options, only one of them, i.e., text in one color, will be provided to the LLM in one iteration.
  • Figure 4: The room-32-32-4 map (left) and the maze-32-32-2 map (right). The picture is vertically flipped to match the convention that higher positions on the vertical axis indicate greater values.
  • Figure 5: Two symmetry-breaking examples, originally from li2019symmetry. In the first subfigure, every pair of shortest paths collides. In the second subfigure, every pair of shortest paths collides in a fixed cell (1,2).
  • ...and 4 more figures