SIT-Graph: State Integrated Tool Graph for Multi-Turn Agents

Sijia Li, Yuchen Huang, Zifan Liu, Zijian Li, Jingjing Fu, Lei Song, Jiang Bian, Jun Zhang, Rui Wang

TL;DR

The paper tackles the challenge of multi-turn tool use under partial observability by introducing SIT-Graph, a State Integrated Tool Graph that unifies episodic-like state fragments with procedural tool dependencies. It augments a tool graph with compact edge-level state summaries and a dedicated state-summarization tool, letting the agent switch adaptively between episodic recall and procedural execution. The method is evaluated on four stateful benchmarks, showing consistent gains over memory-based and tool-graph baselines, with pronounced improvements for weaker base models and in online, training-free settings. These results suggest that SIT-Graph enables more robust tool selection and more effective experience transfer in open-ended, evolving task environments.

Abstract

Despite impressive advances in agent systems, multi-turn tool-use scenarios remain challenging, mainly because intent is clarified progressively and the environment evolves with each tool call. While reusing past experience is natural, current LLM agents either treat entire trajectories or pre-defined subtasks as indivisible units, or exploit only tool-to-tool dependencies, which hinders adaptation as states and information evolve across turns. In this paper, we propose a State Integrated Tool Graph (SIT-Graph), which enhances multi-turn tool use by exploiting partially overlapping experience. Inspired by human decision-making, which integrates episodic and procedural memory, SIT-Graph captures both compact state representations (episodic-like fragments) and tool-to-tool dependencies (procedural-like routines) from historical trajectories. Specifically, we first build a tool graph from accumulated tool-use sequences, and then augment each edge with a compact state summary of the dialog and tool history that may shape the next action. At inference time, SIT-Graph enables a human-like balance between episodic recall and procedural execution: when the next decision requires recalling prior context, the agent retrieves the state summaries stored on relevant edges and uses them to guide its next action; when the step is routine, it follows high-confidence tool dependencies without explicit recall. Experiments across multiple stateful multi-turn tool-use benchmarks show that SIT-Graph consistently outperforms strong memory- and graph-based baselines, delivering more robust tool selection and more effective experience transfer.

Paper Structure

This paper contains 26 sections, 6 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Comparison with episodic-memory-based and tool-graph-based methods in multi-turn tool-use tasks. (a) Memory-based methods store memories as discrete items, typically incorporate only episodic event memories, and retrieve experience at the trajectory level. Before the task objective becomes sufficiently clear, this retrieval may return mismatched experiences, which can lead to task failure. (b) Tool graphs typically encode dependency relationships between tools, relying solely on tool dependencies extracted from previous experience, and do not consider task-specific states. (c) Our method represents both episodic and procedural memory within a unified structure and adaptively leverages these two types of memory at the state level. In the example shown, the actual issue lies in the mobile phone settings, and the agent can confirm the underlying problem only after invoking the information tool, which updates the state. Method (a) retrieves a solution based solely on the observed symptoms, before the agent has identified the true cause, resulting in an incorrect response. Method (b) tends to select the tool that was most frequently used in past trajectories or is highly connected in the tool-dependency graph; because it ignores the current state and task-specific information, it also often selects an incorrect tool.
  • Figure 2: Building the graph. We build the graph from previous successful trajectories. Every tool invoked in a historical trajectory, except the summarization tool, is treated as a node, and edges connect nodes according to the order in which the tools were called. The state information produced by the summarization tool is stored on edges. Each edge always carries a weight and may additionally carry the corresponding state information, depending on whether the agent chose to invoke the summarization tool between the two tool nodes.
  • Figure 3: Adaptively leveraging the graph. First, we locate the last tool call in the graph and identify the candidate nodes it is connected to. Then, the agent decides whether it should summarize the current state. If the decision is yes, the agent calls the summarization tool and abstracts the current state as a summary. It then compares this summarized state with the states stored on the connected edges and selects the two edges with the highest similarity as the next tool candidates. If the decision is no, the agent directly selects the next two tools by comparing edge weights. Both the similarity comparison and the weight comparison return two candidate tools as suggestions.
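
The graph construction and adaptive lookup described in Figures 2 and 3 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the trajectory format (a list of `(tool, payload)` pairs), the tool names, the `summarize_state` identifier, and the use of token-level Jaccard overlap as the similarity measure are all hypothetical choices made for the example. The captions do not specify how similarity is computed or how the weight is defined; here the weight is simply the transition count.

```python
from collections import defaultdict
from dataclasses import dataclass, field

SUMMARY_TOOL = "summarize_state"  # hypothetical name for the summarization tool


@dataclass
class Edge:
    weight: int = 0  # assumption: number of times this transition was observed
    states: list[str] = field(default_factory=list)  # optional state summaries


def build_graph(trajectories):
    """Build graph[src][dst] -> Edge from successful trajectories.

    Each trajectory is a list of (tool, payload) pairs. A SUMMARY_TOOL entry
    carries summary text and is stored on the next edge, never as a node.
    """
    graph = defaultdict(dict)
    for trajectory in trajectories:
        prev_tool, pending_summary = None, None
        for tool, payload in trajectory:
            if tool == SUMMARY_TOOL:
                pending_summary = payload
                continue
            if prev_tool is not None:
                edge = graph[prev_tool].setdefault(tool, Edge())
                edge.weight += 1
                if pending_summary is not None:
                    edge.states.append(pending_summary)
            prev_tool, pending_summary = tool, None
    return graph


def jaccard(a, b):
    """Token-overlap similarity; a stand-in for whatever measure is really used."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def suggest_next(graph, last_tool, current_summary=None, k=2):
    """Return up to k candidate tools reachable from last_tool.

    With a summary: rank edges by best similarity to their stored states.
    Without one: rank edges by weight (the routine, procedural path).
    """
    edges = graph.get(last_tool, {})
    if current_summary is None:
        def score(edge):
            return edge.weight
    else:
        def score(edge):
            return max((jaccard(current_summary, s) for s in edge.states),
                       default=0.0)
    ranked = sorted(edges.items(), key=lambda kv: score(kv[1]), reverse=True)
    return [tool for tool, _ in ranked[:k]]


# Illustrative trajectories with made-up tool names.
trajectories = [
    [("get_info", None),
     (SUMMARY_TOOL, "roaming disabled in phone settings"),
     ("toggle_roaming", None)],
    [("get_info", None), ("refuel_data", None)],
    [("get_info", None), ("refuel_data", None)],
    [("get_info", None),
     (SUMMARY_TOOL, "data plan quota exhausted"),
     ("refuel_data", None)],
    [("get_info", None), ("reset_apn", None)],
]
graph = build_graph(trajectories)
by_weight = suggest_next(graph, "get_info")
by_state = suggest_next(graph, "get_info",
                        "user reports roaming is disabled in settings")
```

In this toy data, the weight-based path ranks the most frequent follow-up tool first, while the state-based path ranks first the tool whose stored summary best matches the current state. That is the contrast the Figure 1 example draws between frequency-driven and state-aware selection.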