When does Chain-of-Thought Help: A Markovian Perspective
Zihan Wang, Yijun Dong, Qi Lei
TL;DR
This work analyzes when and why CoT helps, quantifies how noise in intermediate steps modulates its benefit, and designs synthetic benchmarks that isolate transition alignment and intermediate-step noise, complementing prior results on real-world tasks.
Abstract
Chain-of-Thought (CoT) prompting is a widely used inference-time technique for improving reasoning, yet its gains are uneven across tasks. We analyze when and why CoT helps by modeling the step-wise reasoning trajectory as a Markov chain: each intermediate step is a state, and the dependence between steps is captured by a transition kernel. Our theory identifies transition alignment (whether instances share a common step-wise transition kernel) as the key determinant of CoT's effectiveness. When transitions are identical across steps, CoT reduces inference-time sample complexity: fewer in-context sample trajectories suffice to recover the final decision. In contrast, when transitions differ across steps, these gains can vanish. We further quantify how noise in intermediate steps modulates CoT's benefit. Beyond theory, we design synthetic benchmarks that isolate these factors, both to complement prior results on real-world tasks and to empirically validate our predictions.
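As a rough illustration of the Markov-chain view described in the abstract, the following minimal sketch samples reasoning trajectories from per-step transition kernels and counts observed transitions. It is not the authors' benchmark or estimator; the function names (`sample_trajectory`, `count_transitions`), the two-state kernel `P`, and the trajectory counts are all hypothetical choices made for illustration. The point it tries to convey is only that, when every step shares one kernel, transitions from all steps can be pooled, so each trajectory contributes several observations of the same kernel rather than one.

```python
import random


def sample_trajectory(kernels, rng):
    """Sample one trajectory; kernels[t][s] is the row of next-state
    probabilities for step t when the current state is s."""
    n_states = len(kernels[0])
    state = rng.randrange(n_states)  # uniform initial state (an assumption)
    path = [state]
    for kernel in kernels:
        state = rng.choices(range(n_states), weights=kernel[state])[0]
        path.append(state)
    return path


def count_transitions(paths, n_states, step=None):
    """Count (s, s') transitions at a single step if `step` is given,
    otherwise pooled over all steps of every trajectory."""
    counts = [[0] * n_states for _ in range(n_states)]
    for path in paths:
        steps = range(len(path) - 1) if step is None else [step]
        for t in steps:
            counts[path[t]][path[t + 1]] += 1
    return counts


# Hypothetical setup: 20 trajectories, 5 steps, one kernel shared by all steps.
rng = random.Random(0)
P = [[0.9, 0.1], [0.2, 0.8]]
paths = [sample_trajectory([P] * 5, rng) for _ in range(20)]

pooled = count_transitions(paths, 2)           # valid only if steps share a kernel
single = count_transitions(paths, 2, step=0)   # what one step alone provides
print(sum(map(sum, pooled)), sum(map(sum, single)))
```

With 20 trajectories of 5 steps each, the pooled table holds 100 observed transitions while any single step supplies only 20. If the kernels differed across steps, pooling would no longer be justified and each kernel would have to be estimated from its own 20 observations, which is one informal way to read the abstract's statement that the gains can vanish in that case.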
