A Training-free LLM Framework with Interaction between Contextually Related Subtasks in Solving Complex Tasks

Hongjia Liu, Jinlong Li

TL;DR

This paper tackles the problem of information loss and redundant work when solving complex tasks by decomposing them into subtasks executed in isolation. It introduces Interactions For task Decomposition (IFD), a training-free framework featuring a subtask trajectory memory and an overview-based execution summary, plus an interact action to enable cross-subtask queries and responses. The methodology combines a dynamic planner, a ReAct-style executor, and an interaction handler to permit information exchange between related subtasks without additional training. Empirical results on WebShop and HotpotQA show that IFD achieves state-of-the-art performance among training-free baselines, validating that structured interaction and concise overviews improve efficiency and multi-hop reasoning while reducing context complexity.

Abstract

Large language models (LLMs) have shown remarkable capabilities in solving complex tasks. Recent work has explored decomposing such tasks into subtasks with independent contexts. However, some contextually related subtasks may encounter information loss during execution, leading to redundant operations or execution failures. To address this issue, we propose a training-free framework with an interaction mechanism, which enables a subtask to query specific information or trigger certain actions in completed subtasks by sending requests. To implement interaction, we introduce a subtask trajectory memory that allows completed subtasks to be resumed upon receiving interaction requests. Additionally, we propose a new action during execution, which generates a concise and precise description of the execution process and outcomes of a subtask, to assist subsequent subtasks in determining interaction targets and requests. We evaluate our framework on the interactive decision-making task WebShop and the multi-hop question answering task HotpotQA, with GPT-3.5 and GPT-4, and the comparison results show that our framework outperforms the state-of-the-art training-free baselines.

Paper Structure

This paper contains 28 sections, 10 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Two interaction requests generated in a question-answering task. The red dashed box: a request generated based on the goal of subtask 1, which leads to an invalid request because subtask 1 was not accomplished as expected. The green dashed box: a request generated based on the overview of subtask 1, which receives a response containing effective information.
  • Figure 2: The framework of IFD, illustrating the execution process of subtask $T_{i+1}$. The planner generates subtask $T_{i+1}$ based on the completed subtasks and their overviews. The executor processes subtask $T_{i+1}$ in a ReAct-style trajectory until it generates an interaction with subtask $T_d$ in the form of a request. The interaction handler resumes subtask $T_d$ by loading its trajectory from the subtask trajectory memory and produces the response through iterative execution. After receiving the response, the executor continues executing subtask $T_{i+1}$ and generates an overview upon completion. Lastly, the trajectory and overview of subtask $T_{i+1}$ are stored in the subtask trajectory memory.
  • Figure 3: Execution process of IFD in WebShop, including two kinds of interactions. The search subtask searches for results. The check subtasks request from the search subtask the ID of the next product that meets specific requirements. The buy subtask requests that the search subtask navigate to the page containing a particular product. To respond to these requests, the search subtask may need to perform page flipping.
  • Figure 4: The ReAct method with page flipping functionality encounters issues on several examples.
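
The control flow described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `SubtaskRecord`, `TrajectoryMemory`, `Action`, `run_subtask`, `scripted`, and `lookup_respond` are all hypothetical, the planner is omitted, the LLM calls are replaced by scripted step functions, and the interaction handler is reduced to a keyword lookup over the stored trajectory rather than the iterative execution the caption describes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class Action:
    """One executor step (hypothetical): 'act', 'interact', or 'finish'."""
    kind: str
    text: str = ""
    target: Optional[int] = None  # index of the completed subtask to query


@dataclass
class SubtaskRecord:
    """Trajectory and overview of one subtask (hypothetical structure)."""
    goal: str
    trajectory: List[str] = field(default_factory=list)
    overview: str = ""


class TrajectoryMemory:
    """Stores records by subtask index so completed subtasks can be resumed."""

    def __init__(self) -> None:
        self._records: Dict[int, SubtaskRecord] = {}

    def store(self, index: int, record: SubtaskRecord) -> None:
        self._records[index] = record

    def load(self, index: int) -> SubtaskRecord:
        return self._records[index]

    def overviews(self) -> Dict[int, str]:
        return {i: r.overview for i, r in self._records.items()}


StepFn = Callable[[str, List[str], Dict[int, str]], Action]
RespondFn = Callable[[SubtaskRecord, str], str]


def run_subtask(index: int, goal: str, memory: TrajectoryMemory,
                step: StepFn, respond: RespondFn,
                max_steps: int = 10) -> SubtaskRecord:
    """Run one subtask, routing 'interact' actions to a completed subtask."""
    record = SubtaskRecord(goal=goal)
    for _ in range(max_steps):
        action = step(goal, record.trajectory, memory.overviews())
        if action.kind == "interact" and action.target is not None:
            resumed = memory.load(action.target)     # resume the target subtask
            response = respond(resumed, action.text)
            resumed.trajectory.append(f"request: {action.text} -> {response}")
            record.trajectory.append(
                f"interact[{action.target}]: {action.text} -> {response}")
        elif action.kind == "finish":
            record.overview = action.text            # overview written on completion
            break
        else:
            record.trajectory.append(action.text)
    memory.store(index, record)
    return record


def scripted(actions: Iterable[Action]) -> StepFn:
    """Stand-in for an LLM executor: replays a fixed list of actions."""
    it = iter(actions)
    return lambda goal, trajectory, overviews: next(it)


def lookup_respond(record: SubtaskRecord, request: str) -> str:
    """Toy handler: return the first trajectory entry containing the request."""
    for entry in record.trajectory:
        if request in entry:
            return entry
    return "not found"


memory = TrajectoryMemory()
run_subtask(0, "search", memory,
            scripted([Action("act", "observed item B01 price 12"),
                      Action("finish", "searched; found item B01")]),
            lookup_respond)
second = run_subtask(1, "check", memory,
                     scripted([Action("interact", "price", target=0),
                               Action("finish", "price confirmed")]),
                     lookup_respond)
print(second.trajectory)
```

In this toy run, the second subtask never sees the first subtask's full trajectory. It only sees the overview and the response to its request, which is the context-isolation property the framework relies on.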

Theorems & Definitions (3)

  • Example 1
  • Example 2
  • Example 3