RouteGoT: Node-Adaptive Routing for Cost-Efficient Graph of Thoughts Reasoning

Yuhang Liu, Ruijie Wang, Yunlong Chu, Bing Hao, Yumeng Lin, Shengzhong Liu, Minglai Shao

TL;DR

RouteGoT is a budget-controllable, node-adaptive routing framework for graph-structured reasoning. It outperforms existing routing baselines by maintaining a superior cost-accuracy trade-off, and it is more robust under varying budget targets and tasks.

Abstract

Large Language Models (LLMs) excel at multi-step reasoning, yet increasing the structural complexity of inference does not consistently improve system-level returns. Methods such as Tree of Thoughts (ToT), Graph of Thoughts (GoT), and Adaptive Graph of Thoughts (AGoT) can boost accuracy on some benchmarks, but they often introduce substantial overhead in token consumption and latency, and their gains can be unstable across task distributions, sometimes underperforming simpler Chain-of-Thought (CoT) or direct input-output (IO) prompting. We attribute this inefficiency to stage-wise and node-wise heterogeneity inside GoT-style reasoning pipelines: high-quality planning and final synthesis are globally coupled and typically benefit from strong models, whereas many intermediate subtasks are localized and can be solved accurately by lighter models with far fewer tokens. Motivated by these observations, we propose RouteGoT, a budget-controllable, node-adaptive routing framework for graph-structured reasoning. RouteGoT performs in-graph routing by prioritizing strong models for planning and synthesis, while dynamically allocating lightweight models and cost-effective strategies to leaf subtasks based on predicted difficulty. It further integrates explicit budget constraints into a global inference scheduler to control graph expansion under a user-specified token budget, enabling predictable performance-cost trade-offs. Experiments across reasoning, retrieval, and multi-hop QA benchmarks show that RouteGoT matches or improves accuracy while substantially reducing token usage; specifically, it achieves an average accuracy improvement of 8.1 percentage points and a 79.1% reduction in output tokens compared to AGoT. Furthermore, RouteGoT outperforms existing routing baselines by maintaining a superior cost-accuracy trade-off, demonstrating improved robustness under varying budget targets and tasks.
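
The routing idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the difficulty thresholds, the per-action token costs, and the names `ACTION_COST`, `pick_model`, `pick_action`, and `schedule` are all assumptions made for illustration. The only elements taken from the source are the three node actions (Decompose, CoT, IO), the preference for strong models at the planning and synthesis stages, and the use of a token budget to limit graph expansion.

```python
# Illustrative sketch only; all numbers and names below are assumptions,
# not values or APIs from the RouteGoT paper.

# Assumed output-token cost of each node action.
ACTION_COST = {"IO": 50, "CoT": 200, "Decompose": 600}


def pick_model(stage):
    """Strong model for globally coupled stages, light model for leaf subtasks."""
    return "strong" if stage in ("plan", "synthesize") else "light"


def pick_action(difficulty, remaining):
    """Choose an action from predicted difficulty, downgrading to fit the budget.

    `difficulty` is an assumed predictor score in [0, 1]; the 0.3 / 0.7
    thresholds are arbitrary placeholders.
    """
    if difficulty >= 0.7:
        ladder = ["Decompose", "CoT", "IO"]
    elif difficulty >= 0.3:
        ladder = ["CoT", "IO"]
    else:
        ladder = ["IO"]
    for action in ladder:
        if ACTION_COST[action] <= remaining:
            return action
    return None  # nothing affordable: stop expanding the graph


def schedule(leaves, budget):
    """Greedily assign (model, action) to leaf subtasks under a token budget."""
    remaining = budget
    plan = []
    for name, difficulty in leaves:
        action = pick_action(difficulty, remaining)
        if action is None:
            break
        remaining -= ACTION_COST[action]
        plan.append((name, pick_model("leaf"), action))
    return plan, remaining


if __name__ == "__main__":
    leaves = [("a", 0.1), ("b", 0.5), ("c", 0.9)]
    print(schedule(leaves, 400))
    print(schedule(leaves, 1000))
```

In this toy version, a tight budget forces the hardest leaf down from Decompose to IO, while a looser budget lets it expand. That mirrors, in a very simplified form, how a budget-aware scheduler can trade accuracy for cost in a predictable way.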

Paper Structure

This paper contains 45 sections, 17 equations, 5 figures, 5 tables, 1 algorithm.

Figures (5)

  • Figure 1: Cost-Accuracy Trade-off. Background colors distinguish low-cost (light green) from high-cost (light red) regions based on the median token consumption across all methods.
  • Figure 2: Overview of the RouteGoT framework.
  • Figure 3: Performance comparison under varying computational budgets on GPQA.
  • Figure 4: Evaluation of Routing Mechanism (RQ3). (a) Comparison of decision quality metrics (Regret, Utility). (b) Distribution of node complexity across different methods. (c) Performance analysis of the cost predictor.
  • Figure 5: Case study on a GPQA biomedical task. RouteGoT dynamically routes nodes based on semantic triggers. Red nodes represent Decompose actions for high-level planning, blue nodes indicate CoT for relational analysis, and green nodes denote IO for efficient pruning. Compared to AGoT, RouteGoT avoids redundant sub-graph expansion on distractors, identifying the correct answer with substantially reduced token consumption.