TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation

Yaoxiang Wang, Zhiyong Wu, Junfeng Yao, Jinsong Su

TL;DR

TDAG tackles real-world, multi-step tasks with LLM-based agents by dynamically decomposing tasks and generating specialized subagents. ItineraryBench provides a fine-grained travel-planning benchmark and simulator to evaluate memory, planning, and tool usage across progressively complex tasks. Empirical results show TDAG outperforms strong baselines, with ablations confirming the value of both dynamic decomposition and agent generation. Together, the framework and benchmark offer a practical path toward adaptable, memory-aware agents capable of reliable real-world problem solving.

Abstract

The emergence of Large Language Models (LLMs) like ChatGPT has inspired the development of LLM-based agents capable of addressing complex, real-world tasks. However, these agents often struggle during task execution due to methodological constraints, such as error propagation and limited adaptability. To address this issue, we propose a multi-agent framework based on dynamic Task Decomposition and Agent Generation (TDAG). This framework dynamically decomposes complex tasks into smaller subtasks and assigns each to a specifically generated subagent, thereby enhancing adaptability in diverse and unpredictable real-world tasks. Simultaneously, existing benchmarks often lack the granularity needed to evaluate incremental progress in complex, multi-step tasks. In response, we introduce ItineraryBench in the context of travel planning, featuring interconnected, progressively complex tasks with a fine-grained evaluation system. ItineraryBench is designed to assess agents' abilities in memory, planning, and tool usage across tasks of varying complexity. Our experimental results reveal that TDAG significantly outperforms established baselines, showcasing its superior adaptability and context awareness in complex task scenarios.
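
The control flow described in the abstract (decompose a task, hand each subtask to a freshly generated subagent, and revise the remaining plan as subtasks finish) can be illustrated with a minimal sketch. This is not the authors' implementation: `Subtask`, `run_tdag`, and the three `toy_*` functions are hypothetical names, and the stubs stand in for what would be LLM calls in a real system.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Subtask:
    """One unit of work produced by decomposition (hypothetical structure)."""
    description: str
    result: Optional[str] = None
    done: bool = False


def run_tdag(
    task: str,
    decompose: Callable[[str], List[str]],
    generate_subagent: Callable[[Subtask, List[Subtask]], Callable[[str], str]],
    update_plan: Callable[[List[Subtask], Subtask], List[Subtask]],
) -> List[Subtask]:
    """Run subtasks one at a time, revising the remaining plan after each."""
    pending = [Subtask(d) for d in decompose(task)]
    finished: List[Subtask] = []
    while pending:
        current = pending.pop(0)
        # A subagent is generated per subtask, with access to earlier results.
        agent = generate_subagent(current, finished)
        current.result = agent(current.description)
        current.done = True
        finished.append(current)
        # Dynamic decomposition: the remaining subtasks may change here.
        pending = update_plan(pending, current)
    return finished


# Toy stand-ins for LLM calls (assumptions for illustration only).
def toy_decompose(task: str) -> List[str]:
    return ["book transport", "choose attractions"]


def toy_generate_subagent(subtask: Subtask, finished: List[Subtask]):
    def agent(description: str) -> str:
        return f"done: {description} (after {len(finished)} earlier subtasks)"
    return agent


def toy_update_plan(pending: List[Subtask], last: Subtask) -> List[Subtask]:
    # Insert a follow-up subtask once transport has been booked.
    if last.description == "book transport":
        return [Subtask("confirm arrival time")] + pending
    return pending


results = run_tdag(
    "plan a two-city trip", toy_decompose, toy_generate_subagent, toy_update_plan
)
for subtask in results:
    print(subtask.description, "->", subtask.result)
```

In this sketch the plan grows from two subtasks to three while it runs, which is the behavior a fixed, upfront decomposition would not capture.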

Paper Structure

This paper contains 29 sections, 6 equations, 5 figures, and 6 tables.

Figures (5)

  • Figure 1: Example of a travel planning task. The task specifies initial conditions and constraints for generating a travel itinerary.
  • Figure 2: Overview of TDAG framework. It shows the process of decomposing a complex task into multiple subtasks, which are then dynamically updated based on the completion status of preceding subtasks. Each subtask is assigned to a specially generated subagent, ensuring targeted and efficient task execution.
  • Figure 3: Comparison of method performance using binary scoring and fine-grained evaluation.
  • Figure 4: City Number Distribution.
  • Figure 5: Attraction Number Distribution.
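
The distinction that Figure 3 draws between binary scoring and fine-grained evaluation can be illustrated with a small sketch. The actual ItineraryBench metric is not given in this excerpt, so the code assumes, purely for illustration, that a task is graded against a list of pass/fail checks and that the fine-grained score is the fraction of checks passed. The function names are hypothetical.

```python
from typing import List


def binary_score(checks: List[bool]) -> float:
    """All-or-nothing: credit only if every check passes."""
    return 1.0 if checks and all(checks) else 0.0


def fine_grained_score(checks: List[bool]) -> float:
    """Partial credit: fraction of checks passed (assumed definition)."""
    return sum(checks) / len(checks) if checks else 0.0


# An itinerary that satisfies three of four hypothetical constraints.
checks = [True, True, True, False]
print(binary_score(checks))        # 0.0
print(fine_grained_score(checks))  # 0.75
```

Under binary scoring this nearly complete itinerary is indistinguishable from a total failure, whereas the partial-credit score reflects the incremental progress.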