CoAct: A Global-Local Hierarchy for Autonomous Agent Collaboration

Xinming Hou, Mingming Yang, Wenxiang Jiao, Xing Wang, Zhaopeng Tu, Wayne Xin Zhao

TL;DR

CoAct is a two-agent hierarchical framework for autonomous LLM collaboration that pairs a global planning agent with a local execution agent to tackle long-horizon, real-world tasks. The global planner produces macro-level phase plans and subtask descriptions; the local executor carries out each subtask and reports execution feedback, which can trigger replanning. On the WebArena benchmark, CoAct substantially outperforms ReAct, with improvements of up to ~70% SR (success rate) when using force-stop interventions. The accompanying analyses identify planning and memory-related bottlenecks as opportunities for enhancement. The work shows that explicit global-local task decomposition and adaptive replanning make autonomous web navigation more robust, and it suggests useful directions for integrating web-page knowledge and memory into planning.

Abstract

Existing LLMs exhibit remarkable performance on various NLP tasks, but still struggle with complex real-world tasks, even when equipped with advanced strategies like CoT and ReAct. In this work, we propose the CoAct framework, which transfers the hierarchical planning and collaboration patterns in human society to LLM systems. Specifically, our CoAct framework involves two agents: (1) A global planning agent, to comprehend the problem scope, formulate macro-level plans, and provide detailed sub-task descriptions to local execution agents, which together serve as the initial rendition of a global plan. (2) A local execution agent, to operate within the multi-tier task execution structure, focusing on detailed execution and implementation of specific tasks within the global plan. Experimental results on the WebArena benchmark show that CoAct can re-arrange the process trajectory when facing failures, and achieves superior performance over baseline methods on long-horizon web tasks. Code is available at https://github.com/xmhou2002/CoAct.
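
The plan, execute, and replan-on-failure loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`Feedback`, `Planner`, `Executor`, `run_global_local`, `max_replans`, and the toy stubs) is hypothetical, and the two agents are stood in for by plain functions rather than LLM calls. It also assumes, for simplicity, that a revised plan is executed from its first subtask.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Feedback:
    """Result the local executor reports back for one subtask."""
    success: bool
    note: str = ""


# Hypothetical signatures: a planner maps (task, last failure note) to an
# ordered list of subtask descriptions; an executor maps one subtask
# description to feedback.
Planner = Callable[[str, Optional[str]], List[str]]
Executor = Callable[[str], Feedback]


def run_global_local(task: str, plan: Planner, execute: Executor,
                     max_replans: int = 3) -> bool:
    """Run subtasks in order; on failure, ask the planner for a revised plan."""
    failure_note: Optional[str] = None
    for _ in range(max_replans + 1):
        subtasks = plan(task, failure_note)
        for subtask in subtasks:
            feedback = execute(subtask)
            if not feedback.success:
                # Record what went wrong and hand control back to the planner.
                failure_note = f"{subtask}: {feedback.note}"
                break
        else:
            return True  # every subtask in this plan succeeded
    return False  # replanning budget exhausted


# Toy stand-ins for the two agents (assumptions, for illustration only).
def toy_plan(task: str, failure_note: Optional[str]) -> List[str]:
    if failure_note is None:
        return ["open search page", "click missing button"]
    return ["open search page", "use menu instead"]


def toy_execute(subtask: str) -> Feedback:
    if subtask == "click missing button":
        return Feedback(False, "element not found")
    return Feedback(True)


print(run_global_local("find last order", toy_plan, toy_execute))
```

With the toy stubs, the first plan fails on its second subtask, the failure note is passed back to the planner, and the revised plan completes. Setting `max_replans=0` disables replanning, so the same task fails.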

Paper Structure

This paper contains 21 sections, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: The framework of CoAct, which involves a global planning agent and a local execution agent to work together in a hierarchical relationship to accomplish tasks.
  • Figure 2: Workflow of global planning agent.
  • Figure 3: Workflow of local execution agent.
  • Figure 4: An example in the Shop task to show the advantage of CoAct over ReAct.