Scale-Plan: Scalable Language-Enabled Task Planning for Heterogeneous Multi-Robot Teams

Piyush Gupta, Sangjae Bae, Jiachen Li, David Isele

TL;DR

Scale-Plan is a scalable LLM-assisted framework that generates compact, task-relevant problem representations from natural language instructions. It outperforms pure LLM and hybrid LLM-PDDL baselines across all metrics, improving scalability and reliability.

Abstract

Long-horizon task planning for heterogeneous multi-robot systems is essential for deploying collaborative teams in real-world environments, yet it remains challenging due to the large volume of perceptual information, much of which is irrelevant to task objectives and burdens planning. Traditional symbolic planners rely on manually constructed problem specifications, limiting scalability and adaptability, while recent large language model (LLM)-based approaches often suffer from hallucinations and weak grounding (i.e., poor alignment between generated plans and actual environmental objects and constraints) in object-rich settings. We present Scale-Plan, a scalable LLM-assisted framework that generates compact, task-relevant problem representations from natural language instructions. Given a PDDL domain specification, Scale-Plan constructs an action graph capturing domain structure and uses shallow LLM reasoning to guide a structured graph search that identifies a minimal subset of relevant actions and objects. By filtering irrelevant information prior to planning, Scale-Plan enables efficient decomposition, allocation, and long-horizon plan generation. We evaluate our approach on complex multi-agent tasks and introduce MAT2-THOR, a cleaned benchmark built on AI2-THOR for reliable evaluation of multi-robot planning systems. Scale-Plan outperforms pure LLM and hybrid LLM-PDDL baselines across all metrics, improving scalability and reliability.

Paper Structure (16 sections, 2 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Task-relevant environment filtering for scalable planning. Naive long-horizon planning considers all detected objects and robot capabilities, leading to large action spaces and planning errors (left). Scale-Plan extracts only task-relevant scene information and skills (right), significantly reducing combinatorial complexity and enabling efficient multi-robot plan synthesis.
  • Figure 2: Minimal STRIPS-style PDDL domain example used to illustrate predicates, action preconditions, and effects. Scale-Plan constructs its action graph directly from such domain specifications.
  • Figure 3: Example PDDL problem instance. The problem specifies objects, initial state, and goal conditions for a particular task.
  • Figure 4: Overview of the Scale-Plan framework. Scale-Plan identifies task-relevant environment information by searching an action graph constructed from the PDDL domain. The filtered information is then used by an LLM-based planning pipeline to perform structured task decomposition, task allocation, and plan integration. The resulting high-level plan is translated into an AI2-THOR–executable plan by Plan-to-Code and executed in the simulator.
  • Figure 5: Edge generation in the action graph. A strict edge $a_1 \rightarrow a_2$ is added if $PRE(a_2) \subseteq EFF(a_1)$. A relaxed edge is added when $PRE(a_2) \cap EFF(a_1) \neq \emptyset$ and strict connectivity is absent, ensuring graph connectivity without over-densification.
  • ...and 1 more figures
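
The edge rule in the Figure 5 caption can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `Action` class, the `build_action_graph` function, and the four toy actions are hypothetical names introduced here. The sketch assumes that preconditions and effects are plain sets of predicate names (ungrounded, positive effects only), that actions with empty preconditions receive no strict incoming edges, and that "strict connectivity is absent" means the target action has no strict incoming edge at all. The caption alone does not settle that last reading.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset  # predicate names required before the action
    eff: frozenset  # predicate names made true by the action


def build_action_graph(actions):
    """Return (strict, relaxed) sets of directed edges (a1.name, a2.name)."""
    strict = set()
    for a1, a2 in permutations(actions, 2):
        # Strict edge: PRE(a2) is a subset of EFF(a1).
        # Assumption: empty precondition sets are skipped, because the
        # empty set is trivially a subset of every effect set.
        if a2.pre and a2.pre <= a1.eff:
            strict.add((a1.name, a2.name))

    has_strict_incoming = {dst for _, dst in strict}
    relaxed = set()
    for a1, a2 in permutations(actions, 2):
        # Relaxed edge: PRE(a2) and EFF(a1) overlap.
        # Assumption: added only when a2 has no strict incoming edge.
        if a2.name not in has_strict_incoming and a2.pre & a1.eff:
            relaxed.add((a1.name, a2.name))
    return strict, relaxed


# Hypothetical toy domain, not taken from the paper.
actions = [
    Action("goto", frozenset(), frozenset({"at"})),
    Action("pick", frozenset({"at"}), frozenset({"holding"})),
    Action("open", frozenset({"at"}), frozenset({"opened"})),
    Action("place_in", frozenset({"holding", "opened"}), frozenset({"inside"})),
]
strict, relaxed = build_action_graph(actions)
print(sorted(strict))   # [('goto', 'open'), ('goto', 'pick')]
print(sorted(relaxed))  # [('open', 'place_in'), ('pick', 'place_in')]
```

In this toy domain no single action establishes both preconditions of `place_in`, so it has no strict predecessor. The relaxed rule then links it to `pick` and `open`, which keeps it reachable in the graph without adding relaxed edges to nodes that are already strictly connected.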