The Kinetics of Reasoning: How Chain-of-Thought Shapes Learning in Transformers?

Zihan Pengmei, Costas Mavromatis, Zhengyuan Shen, Yunyi Zhang, Vassilis N. Ioannidis, Huzefa Rangwala

TL;DR

This study investigates how chain-of-thought (CoT) supervision shapes learning dynamics in transformers using a controlled set of symbolic reasoning tasks and a grokking framework. By contrasting direct-answer and CoT-guided training, it shows that CoT can accelerate generalization and enlarge expressivity on simpler tasks, but may not overcome tasks of high algorithmic complexity, such as Intersection. It also uncovers a transient unfaithfulness phase in which traces diverge from answers before the two align. The authors formalize a three-parameter logistic model for learning curves and an Arrhenius-like second-order framework to interpret how task complexity and data distribution affect learning rates, with CoT effectively lowering learning barriers. Mechanistic analyses via linear probing and causal tracing reveal that CoT shifts computation to earlier, distributed representations and alters causal pathways. These findings illuminate both the potential and the limits of CoT for improving reasoning in transformers and highlight the need for caution when using generated traces as explanations, especially for complex tasks or limited training regimes.

Abstract

Chain-of-thought (CoT) supervision can substantially improve transformer performance, yet the mechanisms by which models learn to follow and benefit from CoT remain poorly understood. We investigate these learning dynamics through the lens of grokking by pretraining transformers on symbolic reasoning tasks with tunable algorithmic complexity and controllable data composition to study their generalization. Models were trained under two settings: (i) producing only final answers, and (ii) emitting explicit CoT traces before answering. Our results show that while CoT generally improves task performance, its benefits depend on task complexity. To quantify these effects, we model accuracy as a function of the logarithm of training steps with a three-parameter logistic curve, revealing how the learning speed and shape vary with task complexity, data distribution, and the presence of CoT supervision. We also uncover a transient trace unfaithfulness phase: early in training, models often produce correct answers while skipping or contradicting CoT steps, before later aligning their reasoning traces with answers. Empirically, we (1) demonstrate that CoT accelerates generalization but does not overcome tasks with higher algorithmic complexity, such as finding list intersections; (2) introduce a kinetic modeling framework for understanding transformer learning; (3) characterize trace faithfulness as a dynamic property that emerges over training; and (4) show that CoT mechanistically alters internal transformer computation.
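To make the three-parameter logistic curve concrete, the following is a minimal sketch of one common parametrization and a brute-force least-squares fit. The exact functional form used in the paper is not given here, so the parameter names (`a_max` for the accuracy ceiling, `rate` for the steepness, `log_mid` for the midpoint in log10 steps), the grid-search fitting routine, and the synthetic data are all assumptions made for illustration.

```python
import math

def logistic_accuracy(step, a_max, rate, log_mid):
    """Assumed form: accuracy as a logistic function of log10(training step).

    a_max   -- accuracy ceiling reached late in training
    rate    -- steepness of the transition (learning speed)
    log_mid -- log10 of the step at which accuracy is a_max / 2
    """
    x = math.log10(step)
    return a_max / (1.0 + math.exp(-rate * (x - log_mid)))

def fit_logistic(steps, accs, a_grid, rate_grid, mid_grid):
    """Grid-search least squares (hypothetical helper, not the paper's fitting method)."""
    best = None
    for a in a_grid:
        for r in rate_grid:
            for m in mid_grid:
                err = sum(
                    (logistic_accuracy(s, a, r, m) - y) ** 2
                    for s, y in zip(steps, accs)
                )
                if best is None or err < best[0]:
                    best = (err, a, r, m)
    return best[1], best[2], best[3]

# Synthetic learning curve generated from known parameters (illustrative only).
steps = [10, 100, 1000, 10000, 100000]
accs = [logistic_accuracy(s, 0.9, 4.0, 3.0) for s in steps]

fitted = fit_logistic(
    steps,
    accs,
    a_grid=[0.5, 0.7, 0.9, 1.0],
    rate_grid=[1.0, 2.0, 4.0, 8.0],
    mid_grid=[2.0, 2.5, 3.0, 3.5, 4.0],
)
```

Under this parametrization, a smaller `log_mid` would correspond to earlier grokking and a larger `a_max` to a higher accuracy ceiling, which is how the effect of CoT supervision could be read off fitted curves. In practice a continuous optimizer would replace the grid search.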

Paper Structure

This paper contains 73 sections, 15 equations, 13 figures, 6 tables, 1 algorithm.

Figures (13)

  • Figure 1: CoT as a learning catalyst to accelerate grokking, where the test accuracy follows a predictable logistic function. With CoT, test accuracy groks earlier and reaches a higher ceiling; Answer outpaces Answer+Trace (the unfaithfulness gap) before aligning, while Direct-Answer groks later. Inset: the memorization-to-generalization barrier, as an energy-landscape sketch, showing CoT splitting a hard task into smaller ones.
  • Figure 2: Illustration of the four formal symbolic reasoning tasks investigated.
  • Figure 3: Analysis of model accuracy over (left) log training steps for the $k=3$ Comparison task and (right) log training FLOPs for the $k=3$ Sorting task. Solid lines represent the averaged experimental results, while dashed lines show the fitted theoretical functions.
  • Figure 4: Unfaithfulness dynamics of the Comparison (top) and Sorting (bottom) tasks with $k=3,4,5$. For the Sorting task, we use the final-answer accuracy with partial credit. Unfaithfulness follows a reverse double-descent pattern, which is deeper for the Sorting task. Vertical dashed lines mark the step at which the model reaches $5\%$ full accuracy.
  • Figure 5: Answer-probe accuracy (z-axis) on the OOD data in Comparison, plotted as a surface over model layers (x-axis) and training steps (y-axis). Color encodes probing accuracy, increasing from blue to orange.
  • ...and 8 more figures
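
The Figure 1 caption describes the unfaithfulness gap as Answer accuracy outpacing Answer+Trace accuracy. A minimal sketch of how such a gap could be computed at one checkpoint is shown below. The record format (one pair of booleans per evaluated example) and the function name are assumptions for illustration, and the sketch uses exact-match correctness rather than the partial credit mentioned for Sorting.

```python
def unfaithfulness_gap(records):
    """Return (answer_acc, answer_and_trace_acc, gap) for one checkpoint.

    records -- list of (answer_correct, trace_correct) boolean pairs,
               one per evaluated example (assumed format).
    """
    n = len(records)
    if n == 0:
        raise ValueError("records must be non-empty")
    # Fraction of examples whose final answer is correct.
    answer_acc = sum(1 for a, _ in records if a) / n
    # Fraction of examples where both the answer and the trace are correct.
    joint_acc = sum(1 for a, t in records if a and t) / n
    # The gap is the share of correct answers reached with an incorrect trace.
    return answer_acc, joint_acc, answer_acc - joint_acc

# Illustrative checkpoint: three correct answers, one of them with a wrong trace.
example = [(True, True), (True, False), (False, False), (True, True)]
answer_acc, joint_acc, gap = unfaithfulness_gap(example)
```

Tracking this gap across checkpoints would show whether it opens early in training and closes later, which is the transient pattern the captions describe.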