Precedent-Informed Reasoning: Mitigating Overthinking in Large Reasoning Models via Test-Time Precedent Learning

Qianyue Wang; Jinwu Hu; Huanxiang Lin; Bolin Chen; Zhiquan Wen; Yaofo Chen; Yu Rong; Mingkui Tan

TL;DR

Experiments across mathematical reasoning, scientific QA, and code generation demonstrate that PIR consistently shortens reasoning traces while maintaining or improving final accuracy across LLMs, yielding outstanding accuracy-efficiency trade-offs.

Abstract

Reasoning in Large Language Models (LLMs) often suffers from inefficient long chain-of-thought traces with redundant self-exploration and validation, which inflate computational costs and even degrade performance. Inspired by human reasoning patterns, where people solve new problems by leveraging related past cases to constrain the search space and reduce trial and error, we propose Precedent-Informed Reasoning (PIR), which shifts the reasoning paradigm of Large Reasoning Models (LRMs) from exhaustive self-exploration to guided learning from precedents. PIR addresses two key challenges: which precedents to adopt and how to use them. First, Adaptive Precedent Selection (APS) constructs, for each question and LRM, a compact set of precedents that are both semantically related and informative for the model. It ranks examples by a joint score that combines semantic similarity and model perplexity, then adapts the number of precedents to maximize perplexity reduction. Second, Test-time Experience Internalization (TEI) performs test-time learning on precedent-informed instructions, updating lightweight adapters to internalize solution patterns and use them as a prior during subsequent reasoning. Experiments across mathematical reasoning, scientific QA, and code generation demonstrate that PIR consistently shortens reasoning traces while maintaining or improving final accuracy across LLMs, yielding outstanding accuracy-efficiency trade-offs.
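
The two APS steps described in the abstract (rank by a joint score, then choose how many precedents to keep) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring formula, so the weighted sum, the `alpha` weight, the min-max normalization, the assumption that higher perplexity marks a more informative precedent, and the `question_ppl` callback are all hypothetical choices made here for illustration.

```python
from typing import Callable, List, Sequence


def rank_precedents(similarities: Sequence[float],
                    perplexities: Sequence[float],
                    alpha: float = 0.5) -> List[int]:
    """Return candidate indices ordered by a hypothetical joint score.

    ASSUMPTION: the score is a weighted sum of semantic similarity and
    min-max normalized perplexity, with higher perplexity treated as
    "more informative". The paper's actual formula may differ.
    """
    if len(similarities) != len(perplexities):
        raise ValueError("one similarity and one perplexity per candidate")
    if not similarities:
        return []
    lo, hi = min(perplexities), max(perplexities)
    span = (hi - lo) or 1.0  # avoid division by zero when all values match
    scores = [alpha * s + (1.0 - alpha) * (p - lo) / span
              for s, p in zip(similarities, perplexities)]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def select_precedents(ranked: Sequence[int],
                      question_ppl: Callable[[List[int]], float],
                      max_k: int) -> List[int]:
    """Keep the ranked prefix that gives the largest perplexity reduction.

    ASSUMPTION: `question_ppl(chosen)` is a caller-supplied function that
    returns the model's perplexity on the question when the precedents in
    `chosen` are placed in the prompt; an empty list gives the baseline.
    """
    baseline = question_ppl([])
    best_k, best_drop = 0, 0.0
    for k in range(1, min(max_k, len(ranked)) + 1):
        drop = baseline - question_ppl(list(ranked[:k]))
        if drop > best_drop:  # strictly better than any shorter prefix
            best_k, best_drop = k, drop
    return list(ranked[:best_k])
```

In this sketch an empty selection is returned when no prefix lowers the question's perplexity, which reflects the idea of adapting the number of precedents per question rather than always using a fixed budget.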

Paper Structure (46 sections, 25 equations, 3 figures, 14 tables, 1 algorithm)

Introduction
Related Work
Efficient Reasoning for LRMs
Test time In-context Learning
Problem Statement and Motivation
Precedent-informed Reasoning
Adaptive Precedent Selection
Test-time Experience Internalization
Experiment
Experiment Setting
Comparison Experiment
Reasoning Behavior Analysis
Ablation Study
More Discussion
Conclusion
...and 31 more sections

Figures (3)

Figure 1: The illustration of different reasoning paradigms. Unlike normal reasoning that relies on self-exploration, precedent-informed reasoning leverages precedent experience to reduce the search space, improving reasoning efficiency.
Figure 2: Accuracy and length on DeepSeek-R1-Qwen2.5-7B/32B and QwQ-32B as we incrementally vary the number of $RS$-filtered precedents in the reasoning prompt. Best accuracy gains are achieved within the given reference budget. We evaluate 100 randomly sampled instances per dataset. Markers denote the best result for each model-dataset pair: $\bullet$ (blue) for MATH500, $\blacksquare$ (red) for GPQA Diamond, $\blacktriangle$ (green) for LiveCode Bench. See Appendix \ref{sup: pre-exp} for details.
Figure 3: Scalability of PIR across different reasoning models, with average accuracy (%) and reasoning tokens on different models reported for each dataset.
