Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!
Subbarao Kambhampati, Kaya Stechly, Karthik Valmeekam, Lucas Saldyt, Siddhant Bhambri, Vardhan Palod, Atharva Gundawar, Soumya Rani Samineni, Durgesh Kalwar, Upasana Biswas
TL;DR
The paper critiques the common practice of treating intermediate tokens as human-like reasoning traces, arguing that this anthropomorphization is misleading and potentially harmful. It surveys test-time inference and post-training strategies that leverage derivational traces, but presents evidence that the traces often lack stable semantics and do not reliably reflect internal reasoning. It offers alternative viewpoints, such as viewing reasoning as the incremental internalization of verifier signals and treating intermediate tokens as prompt augmentation (a Skolem function) that improves performance without requiring semantically meaningful traces. The work urges the community to seek more robust explanations for the capabilities of large reasoning models (LRMs) and to pursue research directions that do not rely on anthropomorphizing intermediate tokens.
Abstract
Intermediate token generation (ITG), where a model produces output before the solution, has been proposed as a method to improve the performance of language models on reasoning tasks. These intermediate tokens have been called "reasoning traces" or even "thoughts", implicitly anthropomorphizing the model by implying that these tokens resemble steps a human might take when solving a challenging problem. In this paper, we present evidence that this anthropomorphization isn't a harmless metaphor and is instead quite dangerous: it confuses the nature of these models and how to use them effectively, and leads to questionable research.
