Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs

Zorik Gekhman; Roee Aharoni; Eran Ofek; Mor Geva; Roi Reichart; Jonathan Herzig

TL;DR

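Through a series of hypothesis-driven controlled experiments, this work identifies two mechanisms by which reasoning improves parametric knowledge recall on simple factual questions: a computational buffer effect, where reasoning tokens support latent computation independent of their semantic content, and factual priming, where generating topically related facts acts as a semantic bridge that facilitates correct answer retrieval.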

Abstract

While reasoning in LLMs plays a natural role in math, code generation, and multi-hop factual questions, its effect on simple, single-hop factual questions remains unclear. Such questions do not require step-by-step logical decomposition, making the utility of reasoning highly counterintuitive. Nevertheless, we find that enabling reasoning substantially expands the capability boundary of the model's parametric knowledge recall, unlocking correct answers that are otherwise effectively unreachable. Why does reasoning aid parametric knowledge recall when there are no complex reasoning steps to be done? To answer this, we design a series of hypothesis-driven controlled experiments, and identify two key driving mechanisms: (1) a computational buffer effect, where the model uses the generated reasoning tokens to perform latent computation independent of their semantic content; and (2) factual priming, where generating topically related facts acts as a semantic bridge that facilitates correct answer retrieval. Importantly, this latter generative self-retrieval mechanism carries inherent risks: we demonstrate that hallucinating intermediate facts during reasoning increases the likelihood of hallucinations in the final answer. Finally, we show that our insights can be harnessed to directly improve model accuracy by prioritizing reasoning trajectories that contain hallucination-free factual statements.
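
The final claim of the abstract, prioritizing reasoning trajectories whose factual statements are hallucination-free, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' procedure: the `Trace` structure, the per-fact `hallucinated` flags (which would come from some external verifier not specified here), and the fallback to a plain majority vote are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Trace:
    """One sampled reasoning trace (hypothetical structure)."""
    answer: str                # final answer produced after the trace
    facts: list[str]           # factual statements found in the reasoning
    hallucinated: list[bool]   # assumed verifier output, one flag per fact


def select_answer(traces: list[Trace]) -> str:
    """Majority-vote over traces, preferring hallucination-free ones.

    A trace counts as "clean" when it states at least one fact and none
    of its facts is flagged as hallucinated. If no trace is clean, the
    vote falls back to all traces.
    """
    if not traces:
        raise ValueError("need at least one trace")
    clean = [t for t in traces if t.facts and not any(t.hallucinated)]
    pool = clean or traces
    return Counter(t.answer for t in pool).most_common(1)[0][0]
```

In this sketch a single clean trace outvotes any number of traces that contain flagged facts or no facts at all; a softer weighting scheme would be an equally plausible reading of "prioritizing".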

Paper Structure (24 sections, 1 equation, 18 figures, 2 tables)

Introduction
Setup
Reasoning Expands The Model's Parametric Knowledge Boundary
Question Complexity is a Poor Predictor of Reasoning Effectiveness
How Reasoning Improves Parametric Recall?
Reasoning Tokens as a Computational Buffer
Factual Priming
Reasoning Hallucinations Encourage Final Answer Hallucinations
From Analysis to Practice: Improving Accuracy by Sampling High-Potential Traces
Case Studies
The Computational Buffer Effect
Factual Priming
Related Work
Conclusion
Appendix
...and 9 more sections

Figures (18)

Figure 1: Pass@$k$ curves across two closed-book QA benchmarks and three LLMs, comparing the same models with reasoning OFF vs ON.
Figure 2: $\Omega$ in all models and datasets (§\ref{sec:setup}). Models are ordered from the most effective (left) to the least effective (right) in terms of pass@$1$.
Figure 3: Reasoning Effectiveness on different question types in SimpleQA-Verified, with 95% confidence intervals.
Figure 4: Computational buffer effect on Gemini-2.5-Flash (§\ref{sec:test_time_compute}). ON Single Dummy overrides the thinking trace with a short dummy sequence. ON Dummy does the same, but repeats the short dummy sequence to match the length of the original trace.
Figure 5: Reasoning effectiveness (Equation \ref{eq:omega}) as a function of the input length in tokens when conditioning on a dummy reasoning trace (see §\ref{sec:test_time_compute}). ON Dummy X overrides the reasoning trace with a short dummy sequence, repeated so that the input length is X.
...and 13 more figures
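
Two quantities in the captions above lend themselves to a short sketch: the pass@$k$ metric of Figure 1 and the length-matched dummy trace of Figures 4 and 5. The code below is illustrative only. `pass_at_k` uses the widely used unbiased estimator $1 - \binom{n-c}{k} / \binom{n}{k}$, which the paper may or may not use. `build_dummy_trace` counts length in whitespace-separated tokens as a stand-in for the model tokenizer, and both function names are hypothetical; the dummy phrase in the usage note is likewise made up, since the captions do not state the actual dummy sequence.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled answers, c of which are correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the probability that a random
    # size-k subset of the n samples contains no correct answer.
    return 1.0 - comb(n - c, k) / comb(n, k)


def build_dummy_trace(dummy_unit: str, target_len: int) -> str:
    """Repeat a short dummy sequence up to target_len whitespace tokens."""
    unit = dummy_unit.split()
    if not unit or target_len < 0:
        raise ValueError("need a non-empty dummy unit and target_len >= 0")
    repeats = -(-target_len // len(unit))  # ceiling division
    # Truncate the repeated sequence so the length matches exactly.
    return " ".join((unit * repeats)[:target_len])
```

For example, `pass_at_k(4, 1, 2)` evaluates to `0.5`, and `build_dummy_trace("let me think", 7)` yields a seven-token trace that ends partway through the third repetition.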
