CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation

Jie Liu, Pan Zhou, Yingjun Du, Ah-Hwee Tan, Cees G. M. Snoek, Jan-Jakob Sonke, Efstratios Gavves

TL;DR

CaPo addresses the challenge of coordinating large language model-driven embodied agents on long-horizon tasks by introducing a two-phase cooperative planning framework: (1) meta-plan generation, where agents collaboratively produce a long-term, coherent task decomposition, and (2) progress-adaptive meta-plan execution, where the plan is iteratively updated in response to new progress through multi-turn discussions. The meta-plan is drafted by a designated designer agent, critiqued by the remaining evaluator agents, and refined over multiple discussion rounds, while a progress-adaptive module updates the plan when new objects are discovered or subtasks are completed. Experimental results on TDW-MAT and C-WAH show CaPo achieves higher task completion rates and efficiency than state-of-the-art baselines, including CoELA, ProAgent, and RoCo, across multiple LLMs and perception settings. The work demonstrates that structured, long-horizon planning combined with progress-driven adaptation markedly improves cooperative behavior in embodied multi-agent systems, with practical implications for complex, collaborative tasks in dynamic environments.

Abstract

In this work, we address the cooperation problem among large language model (LLM) based embodied agents, where agents must cooperate to achieve a common goal. Previous methods often execute actions extemporaneously and incoherently, without long-term strategic and cooperative planning, leading to redundant steps, failures, and even serious repercussions in complex tasks like search-and-rescue missions, where discussion and cooperative planning are crucial. To solve this issue, we propose Cooperative Plan Optimization (CaPo) to enhance the cooperation efficiency of LLM-based embodied agents. Inspired by human cooperation schemes, CaPo improves cooperation efficiency with two phases: 1) meta-plan generation, and 2) progress-adaptive meta-plan and execution. In the first phase, all agents analyze the task, discuss, and cooperatively create a meta-plan that decomposes the task into subtasks with detailed steps, ensuring a long-term strategic and coherent plan for efficient coordination. In the second phase, agents execute tasks according to the meta-plan and dynamically adjust it based on their latest progress (e.g., discovering a target object) through multi-turn discussions. This progress-based adaptation eliminates redundant actions, improving the overall cooperation efficiency of agents. Experimental results on the ThreeDWorld Multi-Agent Transport and Communicative Watch-And-Help tasks demonstrate that CaPo achieves a much higher task completion rate and efficiency compared with state-of-the-art methods. The code is released at https://github.com/jliu4ai/CaPo.
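
The first phase described above (one designer drafting a meta-plan, the other agents evaluating it over several discussion rounds) can be sketched roughly as follows. This is a minimal illustration, not the released implementation: the names `generate_meta_plan`, `Designer`, `Evaluator`, the toy agents, and the `max_rounds=3` budget are all assumptions made for the example, and the callables stand in for LLM prompts.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins for LLM calls (not the API of the CaPo repository).
Designer = Callable[[str, List[str]], str]   # (task, feedback) -> meta-plan text
Evaluator = Callable[[str, str], str]        # (task, meta-plan) -> feedback; "" means "agree"


def generate_meta_plan(task: str, designer: Designer, evaluators: List[Evaluator],
                       max_rounds: int = 3) -> Tuple[str, int]:
    """Return the final meta-plan and how many times the designer revised it."""
    plan = designer(task, [])          # initial draft, no feedback yet
    revisions = 0
    for _ in range(max_rounds):        # assumed discussion budget
        # Collect non-empty feedback from every evaluator agent.
        feedback = [fb for fb in (ev(task, plan) for ev in evaluators) if fb]
        if not feedback:               # all evaluators agree: consensus reached
            break
        plan = designer(task, feedback)
        revisions += 1
    return plan, revisions


# Toy agents that mimic the wood-basket example from Figure 1.
def toy_designer(task: str, feedback: List[str]) -> str:
    if any("basket" in fb for fb in feedback):
        return "Alice: fill the wood basket, then deliver. Bob: explore."
    return "Alice: move items one by one. Bob: explore."


def toy_evaluator(task: str, plan: str) -> str:
    return "" if "basket" in plan else "Use the wood basket to carry several items."


if __name__ == "__main__":
    print(generate_meta_plan("Transport items to the bed", toy_designer, [toy_evaluator]))
```

The loop stops either on consensus or when the round budget is exhausted, which keeps the communication cost bounded in this sketch.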

Paper Structure

This paper contains 29 sections, 13 figures, 11 tables.

Figures (13)

  • Figure 1: Examples of task accomplishment by CoELA (Zhang et al., 2023) and our CaPo. In CoELA, after each action, Alice and Bob communicate to decide the next action, which yields a greedy, single-step plan that is suboptimal. For example, they do not use the wood basket, which can hold multiple objects, and both extemporaneously move a single item to the target bed without a long-term, collaborative plan. In contrast, in CaPo, Alice and Bob first discuss and agree on a long-term meta-plan for strategic cooperation, in which Alice is assigned to move several target items into a wood basket, while Bob moves the remaining target items and searches for the unknown objects. Then, during the execution phase, both follow the meta-plan to accomplish the task and dynamically adapt it to the latest task progress, ensuring its effectiveness and efficiency in coordinating agents.
  • Figure 2: Overview of the CooperAtive Plan Optimization (CaPo) framework for embodied multi-agent cooperation. CaPo consists of two key phases: 1) Meta-plan generation: all agents collaboratively formulate a meta-plan through multi-turn discussions before taking any actions. One agent serves as the meta-plan designer, responsible for creating the meta-plan, while all other agents serve as meta-plan evaluators, providing critical feedback on it. 2) Progress-adaptive meta-plan and execution: as new progress is made, agents use a progress-adaptive planning module to adapt the meta-plan to the latest task progress, ensuring that it remains effective.
  • Figure 3: Example of the evaluation and optimization of the meta-plan via multi-turn discussion between agents. The discussion is triggered by new progress, i.e., Alice finds a new object, 'purse'. Here, Alice acts as the meta-plan designer, while Bob serves as the meta-plan evaluator. The example is taken from the transport task of TDW-MAT.
  • Figure 4: Two types of new progress during task execution. (a) Discovering a new object, poundcake. (b) Completing a subtask.
  • Figure 5: Comparison of Transport Rate (%) of CoELA and CaPo using GPT-3.5 under different time steps.
  • ...and 8 more figures
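
The two progress types listed for Figure 4 (a newly discovered object and a completed subtask) suggest a simple trigger for re-opening the meta-plan discussion. The sketch below shows one way such a trigger might be detected. The `Progress` record and `new_progress` function are hypothetical names introduced for illustration and are not taken from the paper or its code.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Progress:
    """Hypothetical snapshot of what the team knows at one time step."""
    known_objects: Set[str] = field(default_factory=set)
    done_subtasks: Set[str] = field(default_factory=set)


def new_progress(prev: Progress, curr: Progress) -> List[str]:
    """List progress events between two snapshots; an empty list means no replanning."""
    events = [f"discovered object: {obj}"
              for obj in sorted(curr.known_objects - prev.known_objects)]
    events += [f"completed subtask: {sub}"
               for sub in sorted(curr.done_subtasks - prev.done_subtasks)]
    return events


if __name__ == "__main__":
    before = Progress(known_objects={"apple"})
    after = Progress(known_objects={"apple", "poundcake"},
                     done_subtasks={"deliver apple"})
    for event in new_progress(before, after):
        print(event)  # each event would prompt a new designer/evaluator discussion
```

In this sketch, agents keep executing the current meta-plan while `new_progress` returns an empty list and only start a new discussion when it does not, which is one plausible reading of "progress-adaptive".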