Mitigating Misleading Chain-of-Thought Reasoning with Selective Filtering
Yexin Wu, Zhuosheng Zhang, Hai Zhao
TL;DR
The paper addresses unreliable chain-of-thought (CoT) reasoning in small language models by introducing SelF-Reasoner, a three-component system that generates a candidate reasoning chain, predicts the final answer, and uses a CoT verifier to filter out reasoning chains that are not credible. By using CoT only when the verifier judges the reasoning chain to be entailed by the question with sufficient confidence, and predicting the answer directly otherwise, SelF-Reasoner improves accuracy over vanilla fine-tuning and two-level pipelines on ScienceQA, ECQA, and LastLetter, without relying on large language models at test time. The authors also analyze reasoning-chain quality in detail, identify common CoT failure modes, and report a “scaling law” for the CoT filter’s impact. The result is a practical, interpretable way to use CoT in small models, with implications for robust reasoning in resource-constrained settings.
Abstract
Large language models have shown remarkable capabilities by using chain-of-thought (CoT) reasoning to solve intricate questions through step-by-step reasoning chains. The efficacy of such reasoning, however, depends on the quality of the CoT. Flawless CoT reasoning cannot be guaranteed, because some questions are indecomposable and reasoning chains can be erroneous, particularly for small-scale language models. To tackle this challenge, we propose the selective filtering reasoner (SelF-Reasoner), which assesses the entailment relationship between the question and the candidate reasoning chain. We proceed with CoT reasoning when the reasoning chain is deemed confident; otherwise, we predict the answer directly. SelF-Reasoner consistently improves on the fine-tuned T5 baseline across the ScienceQA, ECQA, and LastLetter tasks. Code is available at https://github.com/LibroWu/SelF-Reasoner.
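
The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the released code: the class name SelectiveReasoner, its four callables, and the 0.5 threshold are all hypothetical, and the toy lambdas only stand in for the fine-tuned models.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class SelectiveReasoner:
    """Illustrative wrapper; every field is a hypothetical stand-in for a model."""

    generate_chain: Callable[[str], str]          # question -> candidate reasoning chain
    answer_with_chain: Callable[[str, str], str]  # (question, chain) -> answer
    answer_directly: Callable[[str], str]         # question -> answer, no chain
    score_chain: Callable[[str, str], float]      # (question, chain) -> verifier confidence
    threshold: float = 0.5                        # assumed cut-off, not from the paper

    def predict(self, question: str) -> Tuple[str, bool]:
        """Return (answer, used_cot)."""
        chain = self.generate_chain(question)
        # Use the chain only if the verifier is confident in it.
        if self.score_chain(question, chain) >= self.threshold:
            return self.answer_with_chain(question, chain), True
        # Otherwise fall back to predicting the answer directly.
        return self.answer_directly(question), False


# Toy components so the sketch runs end to end (purely illustrative).
reasoner = SelectiveReasoner(
    generate_chain=lambda q: "chain for: " + q,
    answer_with_chain=lambda q, c: "cot-answer",
    answer_directly=lambda q: "direct-answer",
    score_chain=lambda q, c: 0.9 if "easy" in q else 0.1,
)

easy_result = reasoner.predict("easy question")
hard_result = reasoner.predict("hard question")
```

With these toy components, the first question is answered through the reasoning chain and the second through the direct path, because the stand-in verifier assigns the second chain a score below the assumed threshold.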
