First Heuristic Then Rational: Dynamic Use of Heuristics in Language Model Reasoning
Yoichi Aoki, Keito Kudo, Tatsuki Kuribayashi, Shusaku Sone, Masaya Taniguchi, Keisuke Sakaguchi, Kentaro Inui
TL;DR
This work examines how language models perform multi-step reasoning by probing their dynamic use of heuristics. Through arithmetic reasoning tasks and carefully engineered distractor variants, it shows that models rely more on heuristics early in the reasoning process and progressively adopt goal-directed, rational strategies as they approach the answer, implying a limited capacity to backtrack across many future steps. The analysis introduces the distance-to-go measure $d$ and the minimal solution $h^*$ within a state-transition framework to quantify this shift, and evaluates four models (PaLM2, Llama2-13B, GPT-3.5, and GPT-4) on GSM8K and artificial datasets. The findings offer both cognitive implications (human-like problem solving with dynamic strategy switching) and engineering guidance for prompting and evaluating LMs on complex, multi-step tasks. Limitations include the restriction to four models and two task types, suggesting directions for broader validation and mechanistic exploration.
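To make the probe concrete, below is a minimal sketch, not the authors' released code, of how heuristic reliance could be tabulated against the distance-to-go $d$: generate arithmetic chains with one lexical-overlap distractor, ask a model for its next derivation step at each position, and record how often the distractor wins over the variable required by the minimal solution $h^*$. The names `make_chain`, `query_next_step`, and `heuristic_rate_by_distance` are hypothetical, and `query_next_step` is stubbed where a real LM call would go.

```python
import random
from collections import defaultdict

def make_chain(num_steps, rng):
    """Build a symbolic chain X0 = c0; Xi = X(i-1) + ci; plus one distractor
    whose name lexically overlaps the queried variable (here, X{n-1}b)."""
    chain = [f"X{i}" for i in range(num_steps)]
    facts = [f"{chain[0]} = {rng.randint(1, 9)}"]
    for i in range(1, num_steps):
        facts.append(f"{chain[i]} = {chain[i - 1]} + {rng.randint(1, 9)}")
    distractor = chain[-1] + "b"          # surface-similar but irrelevant
    facts.append(f"{distractor} = {rng.randint(10, 99)}")
    rng.shuffle(facts)
    return facts, chain, distractor

def query_next_step(facts, derived_so_far, query_var):
    """Hypothetical LM hook: return the variable the model resolves next.
    Replace this stub with an actual model call to run the probe."""
    return None  # stub so the sketch executes without an LM

def heuristic_rate_by_distance(num_steps=6, trials=100, seed=0):
    """For each distance-to-go d, estimate how often the model's next step
    follows the lexical-overlap distractor rather than the variable required
    by the minimal solution h*."""
    rng = random.Random(seed)
    counts = defaultdict(lambda: [0, 0])  # d -> [heuristic picks, total]
    for _ in range(trials):
        facts, chain, distractor = make_chain(num_steps, rng)
        for step in range(num_steps):
            d = num_steps - step          # steps remaining before the answer
            choice = query_next_step(facts, chain[:step], chain[-1])
            counts[d][1] += 1
            counts[d][0] += choice == distractor
    return {d: h / t for d, (h, t) in sorted(counts.items())}

if __name__ == "__main__":
    print(heuristic_rate_by_distance())
```

Under the paper's finding, a real model plugged into this probe would show the heuristic-pick rate rising with $d$ (far from the answer) and falling as $d$ shrinks.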
Abstract
Multi-step reasoning instruction, such as chain-of-thought prompting, is widely adopted to elicit better language model (LM) performance. We report on the systematic strategy that LMs employ in such multi-step reasoning processes. Our controlled experiments reveal that LMs rely more heavily on heuristics, such as lexical overlap, in the earlier stages of reasoning, where more reasoning steps remain before the goal. Conversely, their reliance on heuristics decreases as LMs progress closer to the final answer through multiple reasoning steps. This suggests that LMs can backtrack only a limited number of future steps and dynamically combine heuristic strategies with rational ones in tasks involving multi-step reasoning.
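To illustrate what the lexical-overlap heuristic means here (our illustration, not an example from the paper), the sketch below ranks candidate facts purely by how many surface tokens they share with the query, a shortcut that can favor an irrelevant distractor over the fact the next reasoning step actually needs.

```python
import re

def tokens(text):
    """Lowercased word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def lexical_overlap(fact, query):
    """Count tokens shared between a candidate fact and the query."""
    return len(tokens(fact) & tokens(query))

facts = [
    "Tom's sister has 3 apples.",            # needed for the next step
    "Tom has twice as many apples as Bob.",  # distractor with high overlap
]
query = "How many apples does Tom have?"

# The overlap heuristic picks the distractor (3 shared tokens vs. 1),
# even though the rational next step uses the first fact.
print(max(facts, key=lambda f: lexical_overlap(f, query)))
```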
