ReAcTree: Hierarchical LLM Agent Trees with Control Flow for Long-Horizon Task Planning

Jae-Woo Choi, Hyungmin Kim, Hyobin Ong, Minsu Jang, Dohyung Kim, Jaehong Kim, Youngwoo Yoon

TL;DR

ReAcTree tackles long-horizon planning under partial observability by constructing a dynamic, hierarchical agent tree where each node handles a subgoal and can reason, act, or expand, while control-flow nodes implement sequence, fallback, or parallel execution to coordinate subgoals. Two memory systems—episodic memory for subgoal-level in-context examples and working memory for environment observations—enhance in-context reasoning and cross-node awareness. Empirically, ReAcTree and its memory-augmented variant outperform strong baselines like ReAct and Tree-Planner across LoTa-Bench tasks (WAH-NL and ALFRED) and multiple LLMs, including gains with smaller models; ablations illuminate the critical roles of memory and control-flow expressivity. The results demonstrate robust, scalable long-horizon planning for embodied agents and highlight design directions for memory integration and dynamic subgoal expansion in real-world settings.
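The three control-flow types named above can be illustrated with a minimal behavior-tree-style sketch. This is not the authors' implementation: the class names, the `scripted_policy` helper (a stand-in for an LLM agent), and the example subgoals are all hypothetical, and the parallel semantics shown (run every child, succeed only if all succeed) are an assumption that the summary does not confirm.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union

SUCCESS, FAILURE = "success", "failure"


@dataclass
class AgentNode:
    """Leaf node that handles one subgoal (hypothetical stand-in for an LLM agent)."""
    subgoal: str
    policy: Callable[[str], str]

    def run(self) -> str:
        return self.policy(self.subgoal)


@dataclass
class ControlFlowNode:
    """Coordinates children as a sequence, fallback, or parallel node."""
    kind: str
    children: List[Union["AgentNode", "ControlFlowNode"]] = field(default_factory=list)

    def run(self) -> str:
        if self.kind == "sequence":
            # Stop at the first failing child; succeed only if all succeed.
            for child in self.children:
                if child.run() == FAILURE:
                    return FAILURE
            return SUCCESS
        if self.kind == "fallback":
            # Try children in order; succeed as soon as one succeeds.
            for child in self.children:
                if child.run() == SUCCESS:
                    return SUCCESS
            return FAILURE
        if self.kind == "parallel":
            # Assumption: run every child regardless of individual failures,
            # then succeed only if all of them succeeded.
            results = [child.run() for child in self.children]
            return SUCCESS if all(r == SUCCESS for r in results) else FAILURE
        raise ValueError(f"unknown control-flow type: {self.kind}")


def scripted_policy(outcomes: Dict[str, str], log: List[str]) -> Callable[[str], str]:
    """Hypothetical policy: logs each subgoal and returns a scripted outcome."""
    def policy(subgoal: str) -> str:
        log.append(subgoal)
        return outcomes.get(subgoal, SUCCESS)
    return policy


# Illustrative subgoals only; they are not taken from the paper.
log: List[str] = []
policy = scripted_policy({"find pudding in fridge": FAILURE}, log)
tree = ControlFlowNode("sequence", [
    ControlFlowNode("fallback", [
        AgentNode("find pudding in fridge", policy),
        AgentNode("find pudding on kitchen table", policy),
    ]),
    AgentNode("bring pudding to coffee table", policy),
])
result = tree.run()
```

In this sketch the fallback node recovers from the first failed search by trying the second location, so the enclosing sequence can continue to the delivery subgoal. In ReAcTree the tree is built dynamically as agent nodes expand; the static tree here only shows how the control-flow types compose.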

Abstract

Recent advancements in large language models (LLMs) have enabled significant progress in decision-making and task planning for embodied autonomous agents. However, most existing methods still struggle with complex, long-horizon tasks because they rely on a monolithic trajectory that entangles all past decisions and observations, attempting to solve the entire task in a single unified process. To address this limitation, we propose ReAcTree, a hierarchical task-planning method that decomposes a complex goal into more manageable subgoals within a dynamically constructed agent tree. Each subgoal is handled by an LLM agent node capable of reasoning, acting, and further expanding the tree, while control flow nodes coordinate the execution strategies of agent nodes. In addition, we integrate two complementary memory systems: each agent node retrieves goal-specific, subgoal-level examples from episodic memory and shares environment-specific observations through working memory. Experiments on the WAH-NL and ALFRED datasets demonstrate that ReAcTree consistently outperforms strong task-planning baselines such as ReAct across diverse LLMs. Notably, on WAH-NL, ReAcTree achieves a 61% goal success rate with Qwen 2.5 72B, nearly doubling ReAct's 31%.
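As a rough illustration of the two memory systems described above, the sketch below pairs an episodic store of subgoal-level examples with a shared working memory of observations. Everything here is assumed for illustration: the abstract does not say how retrieval is scored or what the working memory stores, so word-overlap (Jaccard) similarity and an object-to-location dictionary stand in for the actual mechanisms, and all class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a stand-in for whatever retriever the paper uses."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


@dataclass
class EpisodicMemory:
    """Stores (subgoal, trajectory) pairs to reuse as in-context examples."""
    examples: List[Tuple[str, str]] = field(default_factory=list)

    def add(self, subgoal: str, trajectory: str) -> None:
        self.examples.append((subgoal, trajectory))

    def retrieve(self, subgoal: str, k: int = 1) -> List[Tuple[str, str]]:
        # Rank stored examples by similarity to the current subgoal.
        ranked = sorted(self.examples,
                        key=lambda ex: jaccard(subgoal, ex[0]),
                        reverse=True)
        return ranked[:k]


@dataclass
class WorkingMemory:
    """Shared across agent nodes; assumed here to map objects to observed locations."""
    observations: Dict[str, str] = field(default_factory=dict)

    def record(self, obj: str, location: str) -> None:
        self.observations[obj] = location

    def recall(self, obj: str) -> Optional[str]:
        return self.observations.get(obj)


# Illustrative contents only; not taken from the paper's data.
episodic = EpisodicMemory()
episodic.add("find the juice", "Think: juice is often in the fridge. Act: walk to fridge.")
episodic.add("put the plate on the table", "Act: walk to table. Act: put plate on table.")
top = episodic.retrieve("find the pudding", k=1)

working = WorkingMemory()
working.record("pudding", "kitchen table")  # written by one agent node
seen_at = working.recall("pudding")          # read later by another node
```

The point of the split is that episodic examples are selected per subgoal rather than per task, while working memory lets a later node reuse an observation an earlier node already made instead of searching again.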

Paper Structure

This paper contains 28 sections, 2 equations, 7 figures, 10 tables.

Figures (7)

  • Figure 1: An illustrative example of how ReAcTree generates an agent tree for the natural language instruction "Please bring one pudding and one juice to the coffee table." The left side shows the hierarchical structure with agent nodes (circles) and control flow nodes (squares); the number inside each circle denotes the execution order, and the attached text box specifies the corresponding subgoal. Control flow nodes are labeled by their types ($\rightarrow$ for sequence, ? for fallback, and $\Rightarrow$ for parallel). The right side highlights the decision-making trajectory of agent node 3, including observation, reasoning, acting, subgoal expansion, and the use of episodic and working memory.
  • Figure 2: Illustration of agent node execution and control flow node execution in ReAcTree.
  • Figure 3: Success case of ReAcTree on the WAH-NL dataset using Qwen 2.5 72B. The snapshot of each step and the tree structure of nodes are shown. The subgoal of each node is also listed.
  • Figure 4: Categories and subcategories of failure cases for ReAcTree+WM.
  • Figure 5: Failure case of ReAct+WM on the WAH-NL dataset using Qwen 2.5 72B.
  • ...and 2 more figures