Route to Reason: Adaptive Routing for LLM and Reasoning Strategy Selection

Zhihong Pan, Kai Zhang, Yuze Zhao, Yupeng Han

TL;DR

This work introduces Route-to-Reason (RTR), a budget-aware framework that jointly routes both LLMs and reasoning strategies to optimize accuracy under computational cost constraints. RTR uses a dual-representation and dual-predictor setup to generate a routing table that scores model-strategy pairs per query, enabling per-instance selection via a simple trade-off parameter. Across seven open-source LLMs and four reasoning strategies, RTR achieves higher accuracy than the best single model while reducing token usage by over 60%, with strong generalization to out-of-distribution tasks. The approach is plug-and-play, adaptable to black-box or white-box models and strategies, and offers a scalable path toward efficient, adaptive reasoning in multi-model settings.

Abstract

The inherent capabilities of a language model (LM) and the reasoning strategies it employs jointly determine its performance in reasoning tasks. While test-time scaling is regarded as an effective approach to tackling complex reasoning tasks, it incurs substantial computational costs and often leads to "overthinking", where models become trapped in "thought pitfalls". To address this challenge, we propose Route-to-Reason (RTR), a novel unified routing framework that dynamically allocates both LMs and reasoning strategies according to task difficulty under budget constraints. RTR learns compressed representations of both expert models and reasoning strategies, enabling their joint and adaptive selection at inference time. This method is low-cost, highly flexible, and can be seamlessly extended to arbitrary black-box or white-box models and strategies, achieving true plug-and-play functionality. Extensive experiments across seven open-source models and four reasoning strategies demonstrate that RTR achieves the best trade-off between accuracy and computational efficiency among all baselines, with higher accuracy than the best single model and over 60% lower token usage.

Paper Structure

This paper contains 42 sections, 7 equations, 14 figures, 4 tables, 1 algorithm.

Figures (14)

  • Figure 1: We propose Route-to-Reason (RTR), a low-cost and flexible expert selection framework capable of jointly optimizing model and strategy selection.
  • Figure 2: Performance and average answer tokens distribution of two different LLMs when responding to queries from subsets of four reasoning tasks.
  • Figure 3: Performance and average answer tokens distribution of Qwen2.5-14B-Instruct under different reasoning strategies when responding to queries from subsets of four reasoning tasks.
  • Figure 4: RTR first encodes the input question, available models, and reasoning strategies. Two predictor modules then estimate the expected performance and token usage for each model-strategy combination, generating a routing table. Finally, the router selects the most suitable model-strategy pair that balances accuracy and efficiency for each question.
  • Figure 5: Distribution of the performance and average answer tokens of different LLMs in response to queries on the four reasoning tasks.
  • ...and 9 more figures
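
The selection step described in the Figure 4 caption can be illustrated with a minimal sketch. It assumes the two predictors have already produced a routing table for one query, and it assumes a particular scoring rule: predicted accuracy minus a trade-off weight times the normalized predicted token count. The scoring rule, the function name `select_pair`, the parameter `lam`, and all model names, strategy names, and numbers below are hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]  # (model name, strategy name), both hypothetical


def select_pair(pred_accuracy: Dict[Pair, float],
                pred_tokens: Dict[Pair, float],
                lam: float) -> Pair:
    """Return the pair with the highest accuracy-minus-cost score.

    Assumed scoring rule (not from the paper):
        score = predicted accuracy - lam * (predicted tokens / max tokens)
    """
    max_tokens = max(pred_tokens.values())

    def score(pair: Pair) -> float:
        normalized_cost = pred_tokens[pair] / max_tokens
        return pred_accuracy[pair] - lam * normalized_cost

    return max(pred_accuracy, key=score)


# Hypothetical routing table for a single query.
pred_accuracy = {
    ("small-model", "direct"): 0.55,
    ("small-model", "cot"): 0.70,
    ("large-model", "direct"): 0.75,
    ("large-model", "cot"): 0.90,
}
pred_tokens = {
    ("small-model", "direct"): 50.0,
    ("small-model", "cot"): 400.0,
    ("large-model", "direct"): 100.0,
    ("large-model", "cot"): 2000.0,
}

accuracy_first = select_pair(pred_accuracy, pred_tokens, lam=0.0)
budget_first = select_pair(pred_accuracy, pred_tokens, lam=1.0)
```

With `lam=0.0` the sketch ignores cost and picks the pair with the highest predicted accuracy; raising `lam` shifts the choice toward cheaper pairs. In this toy table the most expensive pair loses to a cheaper one once the token penalty outweighs its accuracy advantage.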