Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures

Tushar Pandey, Ara Ghukasyan, Oktay Goktas, Santosh Kumar Radha

TL;DR

This work introduces Adaptive Graph of Thoughts (AGoT), a test-time, graph-based inference framework that dynamically decomposes complex prompts into a directed acyclic graph of subproblems. By unifying chain, tree, and graph reasoning with complexity checks and nested graphs, AGoT achieves substantial performance gains across reasoning, retrieval, and explorative tasks without any model pre-training or fine-tuning. On benchmarks such as GPQA, HotpotQA, MoreHopQA, HybridQA, Mini-crosswords, and Game of 24, AGoT attains up to a $+46.2\%$ improvement over direct IO on shuffled GPQA data and strong LAAS gains across retrieval tasks, demonstrating comparable benefits to model distillation while remaining computation-lean at inference. These results highlight the practical value of dynamic decomposition and structured recursion for robust, general-purpose reasoning in large language models, especially when retraining is impractical.

Abstract

Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, yet their performance is highly dependent on the prompting strategy and model scale. While reinforcement learning and fine-tuning have been deployed to boost reasoning, these approaches incur substantial computational and data overhead. In this work, we introduce Adaptive Graph of Thoughts (AGoT), a dynamic, graph-based inference framework that enhances LLM reasoning solely at test time. Rather than relying on fixed-step methods like Chain of Thought (CoT) or Tree of Thoughts (ToT), AGoT recursively decomposes complex queries into structured subproblems, forming a dynamic directed acyclic graph (DAG) of interdependent reasoning steps. By selectively expanding only those subproblems that require further analysis, AGoT unifies the strengths of chain, tree, and graph paradigms into a cohesive framework that allocates computation where it is most needed. We validate our approach on diverse benchmarks spanning multi-hop retrieval, scientific reasoning, and mathematical problem-solving, achieving up to a 46.2% improvement on scientific reasoning tasks (GPQA), comparable to gains achieved through computationally intensive reinforcement learning approaches and outperforming state-of-the-art iterative approaches. These results suggest that dynamic decomposition and structured recursion offer a scalable, cost-effective alternative to post-training modifications, paving the way for more robust, general-purpose reasoning in LLMs.
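The recursive, selective expansion described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `Node`, `agot_sketch`, `decompose`, `is_complex`, `solve`, and `max_depth` are hypothetical, and the three callbacks stand in for what would be LLM calls in practice. DAG edges are simplified here: each subproblem receives the answers of the subproblems resolved before it.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class Node:
    task: str
    answer: Optional[str] = None
    # Non-empty only when the node was judged complex and expanded.
    children: List["Node"] = field(default_factory=list)


def agot_sketch(
    task: str,
    decompose: Callable[[str], List[str]],      # hypothetical: split a task into subproblems
    is_complex: Callable[[str], bool],          # hypothetical: complexity check
    solve: Callable[[str, List[str]], str],     # hypothetical: answer a task given context
    context: Sequence[str] = (),
    depth: int = 0,
    max_depth: int = 2,
) -> Node:
    node = Node(task)
    if depth < max_depth and is_complex(task):
        # Expand only complex tasks into a nested set of subproblems.
        answers = list(context)
        for sub in decompose(task):
            child = agot_sketch(sub, decompose, is_complex, solve,
                                context=tuple(answers),
                                depth=depth + 1, max_depth=max_depth)
            node.children.append(child)
            answers.append(child.answer)
        # The final answer for this node synthesizes all earlier answers.
        node.answer = solve(task, answers)
    else:
        # Simple tasks (or tasks at the depth limit) are answered directly.
        node.answer = solve(task, list(context))
    return node


# Toy stand-ins for the LLM calls, chosen only to keep the example deterministic.
root = agot_sketch(
    "a and b",
    decompose=lambda t: [part.strip() for part in t.split(" and ")],
    is_complex=lambda t: " and " in t,
    solve=lambda t, ctx: f"{t}<{len(ctx)}>",
)
print(root.answer, [child.answer for child in root.children])
```

In this toy run only the top-level task is expanded; the two subproblems are answered directly, so computation is spent only where the complexity check asks for it.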

Paper Structure

This paper contains 23 sections, 7 equations, 7 figures, 3 tables, 1 algorithm.

Figures (7)

  • Figure 1: Performance comparison of reasoning frameworks (Chain of Thought, Autonomous Iteration of Thought, and Adaptive Graph of Thoughts) against the input-output baseline using gpt-4o-mini. Bars represent absolute improvement in percentage points across reasoning, retrieval, and explorative task categories. AGoT demonstrates consistent performance gains across all categories, with the highest improvements in explorative tasks (see the GPQA subsection).
  • Figure 2: Architectural comparison of inference frameworks showing structural evolution from linear (CoT) to more complex reasoning patterns. Chain of Thought (CoT) (Wei et al., 2022) employs sequential reasoning, Tree of Thoughts (ToT) (Yao et al., 2024) introduces branching pathways, Graph of Thoughts (GoT) enables arbitrary connections with refinement, and Autonomous Iteration of Thought (AIoT) (Radha et al., 2024) implements a quasi-linear structure with dynamic guidance. The proposed AGoT framework unifies these approaches through recursive graph structures (orange circles indicating nested computation) while maintaining directed acyclic connectivity. Node symbols: circles represent thought states, squares with checkmarks indicate final answers, and crossed squares denote terminated paths.
  • Figure 3: Schematic showing the layer-wise evolution of AGoT. Each row depicts the creation and completion of a single layer. Miniature nested graphs are superimposed on complex nodes to indicate completion. A check symbol identifies the final node in the top-level graph. All edges are directed downward across neighboring generations. Edges entering complex nodes are implicitly connected to every node in the first layer of the nested graph. Edges exiting complex nodes implicitly originate from the final node of the nested graph.
  • Figure 4: Diagram illustrating a final AGoT state after evaluation of a technical problem from GPQA. Grey labels on the top-level graph identify the single positions that comprise the heritages of these nodes. Each layer in the top graph is labeled from 0 to 2, while layer and heritage labels are omitted from nested graphs for neatness. Text inside each node indicates the generated thought's "title", but does not uniquely identify any node's content.
  • Figure 5: Average absolute difference in performance score for CoT, AIoT, and AGoT versus IO, using gpt-4o-mini. The reasoning category excludes the unshuffled GPQA$_\text{D}$ results (see the GPQA subsection).
  • ...and 2 more figures
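
The implicit edge conventions stated in the Figure 3 caption can be made concrete with a short Python sketch. It is an illustration under assumptions, not the paper's data structure: the `Graph` class, its fields, and the helper names are hypothetical, and the "final node" of a nested graph is assumed to be the last node of its last layer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Graph:
    layers: List[List[str]]                                   # node ids, layer by layer
    edges: List[Tuple[str, str]] = field(default_factory=list)
    nested: Dict[str, "Graph"] = field(default_factory=dict)  # complex node id -> nested graph


def entry_nodes(g: Graph, node: str) -> List[str]:
    """An edge entering a complex node reaches every node in the nested first layer."""
    if node not in g.nested:
        return [node]
    inner = g.nested[node]
    found: List[str] = []
    for first in inner.layers[0]:
        found.extend(entry_nodes(inner, first))
    return found


def exit_node(g: Graph, node: str) -> str:
    """An edge leaving a complex node originates from the nested graph's final node."""
    if node not in g.nested:
        return node
    inner = g.nested[node]
    # Assumption: the final node is the last node of the last layer.
    return exit_node(inner, inner.layers[-1][-1])


def explicit_edges(g: Graph) -> List[Tuple[str, str]]:
    """Rewrite edges that touch complex nodes into edges between simple nodes."""
    result = []
    for src, dst in g.edges:
        for target in entry_nodes(g, dst):
            result.append((exit_node(g, src), target))
    for inner in g.nested.values():
        result.extend(explicit_edges(inner))
    return result


# Toy example: B is a complex node holding a two-layer nested graph.
inner_b = Graph(layers=[["B1", "B2"], ["B3"]], edges=[("B1", "B3"), ("B2", "B3")])
top = Graph(layers=[["A"], ["B"], ["C"]],
            edges=[("A", "B"), ("B", "C")],
            nested={"B": inner_b})
edges = explicit_edges(top)
print(edges)
```

Here the edge into `B` fans out to `B1` and `B2`, and the edge out of `B` starts at `B3`, matching the caption's description of edges entering and exiting complex nodes.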