Understanding Chain-of-Thought in LLMs through Information Theory
Jean-Francois Ton, Muhammad Faaiz Taufiq, Yang Liu
TL;DR
This paper tackles the challenge of evaluating chain-of-thought (CoT) reasoning in LLMs without relying on annotated intermediate steps. It introduces an information-theoretic framework that quantifies the information gain at each reasoning step, using a supervisor model to estimate p(Y | X^M_j); this makes it possible to detect steps that are unidentifiable or that contribute nothing toward the final answer. The approach is validated on toy data, GSM8K, and PRM800K, where it outperforms outcome-based baselines such as ORM and Math-Shepherd at identifying faulty CoT steps and provides granular, task-specific diagnostics. The results suggest that evaluation based on information gain gives more reliable and scalable insight into CoT quality and generalizes to nonlinear or exploratory reasoning patterns, with potential extensions to non-mathematical domains.
Abstract
Large Language Models (LLMs) have shown impressive performance in complex reasoning tasks through the use of Chain-of-Thought (CoT) reasoning, which allows models to break down problems into manageable sub-tasks. However, existing CoT evaluation techniques either require annotated CoT data or fall short in accurately assessing intermediate reasoning steps, leading to high rates of false positives. In this paper, we formalize CoT reasoning in LLMs through an information-theoretic lens. Specifically, our framework quantifies the "information gain" at each reasoning step, enabling the identification of failure modes in LLMs without the need for expensive annotated datasets. We demonstrate the efficacy of our approach through extensive experiments on toy arithmetic, GSM8K, and PRM800K datasets, where it significantly outperforms existing outcome-based methods by providing more accurate insights into model performance on individual sub-tasks.
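
To make the idea of per-step information gain concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a supervisor model has already produced, for each example, the log-probability of the correct final answer Y after every reasoning step, and it estimates the gain of step j as the average change in that log-probability between step j-1 and step j. The function names, the input layout, and the threshold are all hypothetical choices made for illustration.

```python
import math
from statistics import mean


def information_gain_per_step(log_probs):
    """Estimate the information gain of each reasoning step.

    log_probs[i][j] is assumed to be log p(Y_i | state after step j) for
    example i, as scored by a (hypothetical) supervisor model; j = 0 is the
    state before any reasoning step. Returns one gain per step j = 1..T.
    """
    if not log_probs:
        raise ValueError("log_probs must contain at least one example")
    n_states = len(log_probs[0])
    if any(len(row) != n_states for row in log_probs):
        raise ValueError("every example must have the same number of steps")
    # Gain of step j: mean over examples of the change in log-likelihood.
    return [
        mean(row[j] - row[j - 1] for row in log_probs)
        for j in range(1, n_states)
    ]


def flag_uninformative_steps(gains, threshold=1e-6):
    """Return the 1-based indices of steps whose estimated gain is at most
    `threshold` (an illustrative cutoff, not a value from the paper)."""
    return [j for j, g in enumerate(gains, start=1) if g <= threshold]


# Made-up supervisor probabilities for two examples and three steps.
# Step 2 leaves the probability of the correct answer unchanged.
probs = [
    [0.1, 0.4, 0.4, 0.8],
    [0.2, 0.5, 0.5, 0.9],
]
log_probs = [[math.log(p) for p in row] for row in probs]

gains = information_gain_per_step(log_probs)
flagged = flag_uninformative_steps(gains)
```

In this made-up example, step 2 is the only step flagged, because the supervisor's confidence in the correct answer does not move across it. In practice the log-probabilities would come from a trained supervisor model and the estimates would be noisy, so a real cutoff would need to account for estimation error.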
