LaMMA-P: Generalizable Multi-Agent Long-Horizon Task Allocation and Planning with LM-Driven PDDL Planner

Xiaopan Zhang, Hao Qin, Fuquan Wang, Yue Dong, Jiachen Li

TL;DR

LaMMA-P addresses long-horizon, heterogeneous multi-agent task planning by fusing the reasoning strengths of large language models with the structured planning of PDDL and the Fast Downward planner. The framework decomposes human instructions into sub-tasks, allocates them to capable robots, generates and validates PDDL problems, and merges sub-plans into a coherent execution strategy, with a modular design that scales to any number of agents. It introduces MAT-THOR, a challenging AI2-THOR-based benchmark for multi-agent long-horizon tasks, and demonstrates state-of-the-art results, notably a 105% increase in success rate and a 36% improvement in efficiency over strong baselines. The work advances practical, generalizable multi-robot coordination in household-like settings and provides a reusable platform for benchmarking long-horizon planning methods.
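The allocation step described above (assigning sub-tasks to capable robots) can be illustrated with a small sketch. This is not the authors' method: LaMMA-P uses a language model for allocation, whereas the snippet below substitutes a simple greedy rule. The `Robot` and `SubTask` classes, the skill labels, and the task names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Robot:
    name: str
    skills: frozenset  # hypothetical skill labels, e.g. "pickup", "toggle"


@dataclass(frozen=True)
class SubTask:
    name: str
    required_skills: frozenset


def allocate(subtasks, robots):
    """Greedy stand-in for an LM allocator: give each sub-task to the
    least-loaded robot that has every required skill."""
    load = {r.name: 0 for r in robots}
    assignment = {}
    for task in subtasks:
        # A robot is capable if the required skills are a subset of its skills.
        capable = [r for r in robots if task.required_skills <= r.skills]
        if not capable:
            raise ValueError(f"no robot can perform {task.name!r}")
        chosen = min(capable, key=lambda r: load[r.name])
        assignment[task.name] = chosen.name
        load[chosen.name] += 1
    return assignment


# Hypothetical heterogeneous team and sub-tasks.
robots = [
    Robot("robot1", frozenset({"pickup", "put", "toggle"})),
    Robot("robot2", frozenset({"toggle", "open"})),
]
subtasks = [
    SubTask("turn_off_light", frozenset({"toggle"})),
    SubTask("put_phone_on_bed", frozenset({"pickup", "put"})),
    SubTask("open_book", frozenset({"open"})),
]
assignment = allocate(subtasks, robots)
print(assignment)
```

The subset test is what makes the team heterogeneous in this sketch: a sub-task can only go to a robot whose skill set covers it, and the load counter merely breaks ties among capable robots.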

Abstract

Language models (LMs) possess a strong capability to comprehend natural language, making them effective at translating human instructions into detailed plans for simple robot tasks. Nevertheless, it remains a significant challenge to handle long-horizon tasks, especially in subtask identification and allocation for cooperative heterogeneous robot teams. To address this issue, we propose a Language Model-Driven Multi-Agent PDDL Planner (LaMMA-P), a novel multi-agent task planning framework that achieves state-of-the-art performance on long-horizon tasks. LaMMA-P integrates the strengths of the LMs' reasoning capability and the traditional heuristic search planner to achieve a high success rate and efficiency while demonstrating strong generalization across tasks. Additionally, we create MAT-THOR, a comprehensive benchmark that features household tasks with two different levels of complexity based on the AI2-THOR environment. The experimental results demonstrate that LaMMA-P achieves a 105% higher success rate and 36% higher efficiency than existing LM-based multi-agent planners. The experimental videos, code, datasets, and detailed prompts used in each module can be found on the project website: https://lamma-p.github.io.
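As a reading aid for the headline numbers, and assuming both figures are relative improvements over the baseline (the text above does not say so explicitly, but a success rate cannot rise by 105 percentage points), a gain of p percent corresponds to the following ratios. The symbols SR and Eff are shorthand introduced here, not notation from the paper.

```latex
\[
x_{\text{LaMMA-P}} = \left(1 + \frac{p}{100}\right) x_{\text{base}}
\quad\Longrightarrow\quad
\frac{\mathrm{SR}_{\text{LaMMA-P}}}{\mathrm{SR}_{\text{base}}} = 2.05,
\qquad
\frac{\mathrm{Eff}_{\text{LaMMA-P}}}{\mathrm{Eff}_{\text{base}}} = 1.36 .
\]
```

Under this reading, the reported success rate is roughly double that of the baseline planners.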

Paper Structure (14 sections, 3 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: A typical multi-agent long-horizon task in a household scenario where Robot 1 and Robot 2 collaborate to execute tasks based on human commands given in natural language.
  • Figure 2: An overview of LaMMA-P’s modular architecture. Our framework leverages LMs within six key modules: Precondition Identifier (P), Task Allocator, Problem Generator (G), Fast Downward/LLM Planner, PDDL Validator (V), and Sub-Plan Combiner, each serving a specific role in task execution. The Precondition Identifier analyzes the initial conditions and requirements for task completion. The Task Allocator assigns subtasks to agents based on their skill sets and task complexity. The Fast Downward/LLM Planner converts task descriptions into executable plans for each agent. Finally, the Sub-Plan Combiner integrates individual agent plans into a cohesive execution strategy, ensuring synchronized actions to achieve the overall task objectives.
  • Figure 3: The keyframes depict two tasks in different AI2-THOR rooms, highlighting key execution phases. Each row represents a distinct task, with three robots in yellow boxes labeled by number. Objects to be manipulated appear in blue boxes in the prior frame. The first task involves turning off the lights, placing a phone on the bed, and leaving a book open. The second requires placing keys and a watch in a drawer and turning off the lamp and laptop.
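To make the roles of the Problem Generator, PDDL Validator, and Sub-Plan Combiner in Figure 2 more concrete, here is a minimal sketch under stated assumptions. It is not the authors' implementation: the function names, the `household` domain name, the predicates, and the action strings are hypothetical. The validator only checks surface structure, and the combiner simply aligns per-robot plans by time step, ignoring any ordering dependencies between robots.

```python
def make_problem(name, domain, objects, init, goal):
    """Render a PDDL problem string from plain Python lists of facts."""
    objs = " ".join(objects)
    init_s = " ".join(f"({fact})" for fact in init)
    goal_s = " ".join(f"({fact})" for fact in goal)
    return (
        f"(define (problem {name}) (:domain {domain})\n"
        f"  (:objects {objs})\n"
        f"  (:init {init_s})\n"
        f"  (:goal (and {goal_s})))"
    )


def validate(problem):
    """Shallow structural check: balanced parentheses and required sections."""
    depth = 0
    for ch in problem:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a close with no matching open
                return False
    required = ("(define", "(:domain", "(:objects", "(:init", "(:goal")
    return depth == 0 and all(token in problem for token in required)


def combine(sub_plans):
    """Align per-robot action lists into parallel time steps.

    Simplification: assumes the sub-plans are independent of each other.
    """
    horizon = max((len(plan) for plan in sub_plans.values()), default=0)
    return [
        {robot: plan[t] for robot, plan in sub_plans.items() if t < len(plan)}
        for t in range(horizon)
    ]


# Hypothetical problem and sub-plans, for illustration only.
problem = make_problem(
    "lights_off",
    "household",
    ["robot1", "lamp1"],
    ["at robot1 bedroom", "on lamp1"],
    ["off lamp1"],
)
is_valid = validate(problem)
is_truncated_valid = validate(problem[:-1])  # drop the final ")"

sub_plans = {
    "robot1": ["goto lamp1", "switchoff lamp1"],
    "robot2": ["goto book1"],
}
schedule = combine(sub_plans)
print(problem)
print(is_valid, is_truncated_valid)
print(schedule)
```

In a fuller pipeline, a problem that fails validation would typically be sent back for regeneration before being handed to a planner such as Fast Downward. That feedback loop is omitted here.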