ELHPlan: Efficient Long-Horizon Task Planning for Multi-Agent Collaboration

Shaobin Ling, Yun Wang, Chenyou Fan, Tin Lun Lam, Junjie Hu

TL;DR

This paper proposes Efficient Long-Horizon Planning (ELHPlan), a novel framework that introduces Action Chains, sequences of actions explicitly bound to sub-goal intentions, as the fundamental planning primitive and establishes a new efficiency-effectiveness frontier for LLM-based multi-agent planning systems.

Abstract

Large Language Models (LLMs) enable intelligent multi-robot collaboration but face fundamental trade-offs: open-loop methods that compile tasks into formal representations for external executors produce sound plans but lack adaptability in partially observable environments, while iterative methods incur prohibitive computational costs that scale poorly with team size and task complexity. In this paper, we propose Efficient Long-Horizon Planning (ELHPlan), a novel framework that introduces Action Chains, sequences of actions explicitly bound to sub-goal intentions, as the fundamental planning primitive. ELHPlan operates via a cyclical process: 1) constructing intention-bound action sequences, 2) proactively validating for conflicts and feasibility, 3) refining issues through targeted mechanisms, and 4) executing validated actions. This design balances adaptability and efficiency by providing intention-bound action sequences with longer lookahead while avoiding expensive full re-planning. We further advocate comprehensive efficiency metrics, including token consumption and planning time, to more holistically evaluate multi-agent collaboration. Our experiments on benchmarks TDW-MAT and C-WAH demonstrate that ELHPlan achieves comparable task success rates while consuming only 30-40% of the tokens required by state-of-the-art methods. Our research establishes a new efficiency-effectiveness frontier for LLM-based multi-agent planning systems.
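The four-step cycle described in the abstract (construct, validate, refine, execute) can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the `ActionChain` fields, the `construct`, `validate`, `refine`, and `run_cycle` functions, the action strings, and the "replace an unreachable `goto` with `explore`" repair rule are all hypothetical, and the LLM planning call is replaced by a fixed stub.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ActionChain:
    """Actions bound to one sub-goal intention (hypothetical schema)."""
    agent: str
    intention: str
    actions: list[str] = field(default_factory=list)


def construct(agent: str) -> ActionChain:
    # Stand-in for the LLM planning call: returns a fixed chain for illustration.
    return ActionChain(
        agent,
        "deliver apple to kitchen",
        ["goto livingroom", "grasp apple", "goto kitchen", "put apple"],
    )


def validate(chain: ActionChain, reachable: set[str]) -> list[int]:
    # Toy feasibility check: indices of 'goto' actions with an unreachable target.
    return [
        i for i, action in enumerate(chain.actions)
        if action.startswith("goto ") and action.split(" ", 1)[1] not in reachable
    ]


def refine(chain: ActionChain, bad: list[int]) -> ActionChain:
    # Targeted repair (assumed rule): patch only the failing steps,
    # leaving the rest of the chain untouched instead of re-planning it.
    actions = list(chain.actions)
    for i in bad:
        target = actions[i].split(" ", 1)[1]
        actions[i] = f"explore {target}"
    return ActionChain(chain.agent, chain.intention, actions)


def run_cycle(agent: str, reachable: set[str], max_rounds: int = 3) -> list[str]:
    chain = construct(agent)                # 1) construct
    for _ in range(max_rounds):
        bad = validate(chain, reachable)    # 2) validate
        if not bad:
            break
        chain = refine(chain, bad)          # 3) refine
    return list(chain.actions)              # 4) execute (here: return the plan)


plan = run_cycle("agent_1", reachable={"livingroom"})
```

In this toy setting only the infeasible step is rewritten, which mirrors the abstract's point that targeted refinement avoids a full re-planning call.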

Paper Structure

This paper contains 14 sections, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Trajectories and explored maps of CoELA [zhang2024building] (a) and our approach (b) on the same task. The blue and red trajectories show the paths of agent 1 and agent 2, respectively. The trajectories show less redundant overlap under comparable token budgets (22,882 vs. 23,042 tokens), highlighting improved coordination efficiency.
  • Figure 2: An example of an Action Chain presented in JSON-like format, containing the action sequence and intentions.
  • Figure 3: The framework consists of a three-stage workflow (Memory, Planning, and Validation-Refinement) and the corresponding processing flows. Each numbered node represents an action, and the number indicates the planning iteration that generated it: nodes labeled 1 were produced by the first planning call, nodes labeled 2 by the second, and so on. Nodes sharing the same number collectively form one Action Chain. In the refinement stage, the agents' plans are assumed to encounter different refinement cases at different points of execution, showing that each refinement method affects the original plan differently.
  • Figure 4: Illustrative example of ELHPlan during the Construction and Validation-Refinement stages. Initial task allocations are generated and assigned to two agents. Agent 2 performs a Chain Insertion triggered by the 'replan' action, while Agent 1 subsequently refines its action chain to address an infeasible action. Finally, the conflict arising from both agents attempting to 'grasp apple (23)' is resolved. The yellow highlight denotes the action scheduled for execution, and the green highlight denotes the action currently being executed. Items and rooms that share a name are distinguished by numerical labels.
  • Figure 5: Performance when varying the number of robots using GPT‑4o as the LLM backbone.
  • ...and 2 more figures
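
The captions for Figures 2 and 4 mention a JSON-like Action Chain and a conflict in which both agents plan to 'grasp apple (23)'. The sketch below shows one way such chains and a conflict check could look. The field names (`intention`, `actions`), the agent keys, the action strings other than 'grasp apple (23)', and the `find_conflicts` helper are assumptions made for illustration; the paper's actual schema and conflict rules are not given in this excerpt.

```python
import json

# Hypothetical JSON-like Action Chains for two agents.
chains = {
    "agent_1": {
        "intention": "bring the apple to the kitchen",
        "actions": ["goto livingroom (1)", "grasp apple (23)", "goto kitchen (2)"],
    },
    "agent_2": {
        "intention": "collect fruit from the living room",
        "actions": ["goto livingroom (1)", "grasp apple (23)", "grasp banana (7)"],
    },
}


def find_conflicts(chains: dict) -> dict:
    """Return {action: [agents]} for grasp actions planned by more than one agent."""
    claimed = {}
    for agent, chain in chains.items():
        for action in chain["actions"]:
            # Only object grasps are treated as exclusive in this toy check;
            # several agents may walk to the same room.
            if action.startswith("grasp "):
                claimed.setdefault(action, []).append(agent)
    return {action: agents for action, agents in claimed.items() if len(agents) > 1}


conflicts = find_conflicts(chains)
print(json.dumps(conflicts, indent=2))
```

Because each chain carries its intention alongside its actions, a validator can detect the shared grasp before either agent executes it, and a refinement step then needs to rewrite only one agent's chain.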