A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning

Lijie Hu, Liang Liu, Shu Yang, Xin Chen, Hongru Xiao, Mengdi Li, Pan Zhou, Muhammad Asif Ali, Di Wang

TL;DR

This work tackles the lack of principled explanations for Chain-of-Thought (CoT) reasoning in large language models and introduces a Hopfieldian View-based Read-and-Control framework. The approach models CoT as stimulus-driven activation of latent concepts, using Concept Modeling, Concept Simulation, and Representations Reading/Controlling to read and steer the reasoning path. A Bayesian in-context formulation, $P(r|p) = \int_{c} P(r|c,p) P(c|p) \, dc$, formalizes how prompts activate concepts learned during pre-training, and a reading vector $v$ (derived via LAT and PCA) enables error localization and guidance. Across seven datasets spanning arithmetic, commonsense, and symbolic tasks, the framework yields improved accuracy, interpretable error localization, and controllable reasoning, demonstrating practical gains in CoT transparency and reliability.
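The read-and-steer idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes synthetic 4-dimensional "activations" instead of real LLM hidden states, takes the reading vector to be the first principal component of the pooled activations, and models control as adding a scaled copy of that vector. The names `first_principal_component`, `read_score`, `steer`, and `TRUE_DIR` are hypothetical and chosen only for illustration.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def first_principal_component(rows, iters=200):
    """Top PCA direction of `rows` (equal-length float lists), via power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    v = [1.0 / math.sqrt(d)] * d  # deterministic starting direction
    for _ in range(iters):
        # w = X^T (X v), which is proportional to (covariance matrix) @ v
        proj = [dot(c, v) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def read_score(h, v):
    """'Reading': project an activation onto the reading vector."""
    return dot(h, v)

def steer(h, v, alpha):
    """'Control' (assumed additive form): shift an activation along v."""
    return [x + alpha * y for x, y in zip(h, v)]

# Synthetic stand-in for hidden states: activations collected with the
# stimulus are shifted along TRUE_DIR, those without it are shifted against it.
rng = random.Random(0)
TRUE_DIR = [0.6, 0.8, 0.0, 0.0]  # hypothetical concept direction (unit norm)

def sample(sign):
    return [sign * 2.0 * t + rng.gauss(0.0, 0.1) for t in TRUE_DIR]

with_stimulus = [sample(+1) for _ in range(20)]
without_stimulus = [sample(-1) for _ in range(20)]

v = first_principal_component(with_stimulus + without_stimulus)

# PCA fixes the direction only up to sign; orient v so that activations
# collected with the stimulus receive positive scores.
if sum(read_score(h, v) for h in with_stimulus) < 0:
    v = [-x for x in v]

neutral = [0.0, 0.0, 0.0, 0.0]
steered = steer(neutral, v, alpha=1.5)
print("reading vector:", [round(x, 3) for x in v])
print("score before / after control:", read_score(neutral, v), read_score(steered, v))
```

Because `v` has unit norm, steering by `alpha` raises the read score by `alpha`, which makes the effect of control easy to check in this toy setting. With a real model the activations would come from a chosen layer and the steering strength would need tuning.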

Abstract

Chain-of-Thought (CoT) plays a significant role in augmenting the reasoning performance of large language models (LLMs). While some studies focus on improving CoT accuracy through methods like retrieval enhancement, a rigorous explanation for why CoT achieves such success is still lacking. In this paper, we analyze CoT methods under two different settings by asking the following questions: (1) For zero-shot CoT, why does prompting the model with "let's think step by step" significantly impact its outputs? (2) For few-shot CoT, why does providing examples before questioning the model substantially improve its reasoning ability? To answer these questions, we conduct a top-down explainable analysis from the Hopfieldian view and propose a Read-and-Control approach for controlling the accuracy of CoT. Through extensive experiments on seven datasets for three different tasks, we demonstrate that our framework can decipher the inner workings of CoT, localize reasoning errors, and steer the model toward the correct reasoning path.

Paper Structure

This paper contains 40 sections, 6 equations, 11 figures, 13 tables, and 2 algorithms.

Figures (11)

  • Figure 1: Cognitive Brain
  • Figure 2: Neural Network
  • Figure 4: Our CoT explanation framework based on the Hopfieldian view.
  • Figure 5: A real case of reasoning error localization with LLaMA-2-7B-chat in a zero-shot scenario on the GSM8K dataset, using our framework. A green bar indicates that the reasoning snippet is correct, and a red bar indicates that the reasoning snippet may be wrong.
  • Figure 6: A real case predicted by LLaMA-3-8B-instruct with few-shot CoT on the coin flip dataset. The purple part is an example of input-output pairs given by the user. The segment highlighted in blue represents the correct output of the model. The red part shows that the model starts to reason in the wrong direction without control, while the green part indicates that the model reasons in the correct direction after control is added.
  • ...and 6 more figures