Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons

Yifei Wang, Yuheng Chen, Wanting Wen, Yu Sheng, Linjing Li, Daniel Dajun Zeng

TL;DR

Enhancing the recall of parametric knowledge in LLMs directly improves reasoning performance, whereas suppressing it leads to notable degradation. CoT prompting can intensify the recall of factual knowledge by encouraging LLMs to engage in orderly and reliable reasoning.

Abstract

In this paper, we investigate whether Large Language Models (LLMs) actively recall or retrieve their internal repositories of factual knowledge when faced with reasoning tasks. Through an analysis of LLMs' internal factual recall at each reasoning step via Knowledge Neurons (KNs), we reveal that LLMs fail to harness the critical factual associations under certain circumstances. Instead, they tend to opt for alternative, shortcut-like pathways to answer reasoning questions. By manually manipulating the recall process of parametric knowledge in LLMs, we demonstrate that enhancing this recall process directly improves reasoning performance, whereas suppressing it leads to notable degradation. We also assess the effect of Chain-of-Thought (CoT) prompting, a powerful technique for addressing complex reasoning tasks. Our findings indicate that CoT can intensify the recall of factual knowledge by encouraging LLMs to engage in orderly and reliable reasoning. Finally, we explore how contextual conflicts affect the retrieval of facts during the reasoning process to gain a comprehensive understanding of the factual recall behaviors of LLMs. Code and data will be available soon.
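To make the idea of "manipulating the recall process" concrete, the following is a minimal sketch of enhancing or suppressing selected intermediate neurons in a feed-forward layer. It is not the authors' implementation: the toy weights, the ReLU activation, the helper name `ffn_forward`, and the multipliers 0 and 2 are all assumptions chosen for illustration, and the page does not describe how the paper identifies or scales its Knowledge Neurons.

```python
def relu(v):
    """Toy activation; real LLM FFNs typically use other activations."""
    return v if v > 0.0 else 0.0


def ffn_forward(x, w_in, w_out, neuron_scale=None):
    """Toy two-layer FFN: hidden = relu(x @ w_in), out = hidden @ w_out.

    neuron_scale (hypothetical parameter) maps an intermediate neuron
    index to a multiplier applied to that neuron's activation:
    0.0 suppresses the neuron, values above 1.0 enhance it.
    """
    n_hidden = len(w_in[0])
    hidden = [
        relu(sum(x[i] * w_in[i][j] for i in range(len(x))))
        for j in range(n_hidden)
    ]
    # Intervene on the chosen intermediate neurons only.
    for j, scale in (neuron_scale or {}).items():
        hidden[j] *= scale
    n_out = len(w_out[0])
    return [
        sum(hidden[j] * w_out[j][k] for j in range(n_hidden))
        for k in range(n_out)
    ]


# Made-up toy input and weights (2 inputs, 3 intermediate neurons, 2 outputs).
x = [1.0, 2.0]
w_in = [[1.0, 0.0, -1.0],
        [0.0, 1.0, 1.0]]
w_out = [[1.0, 0.0],
         [0.0, 1.0],
         [1.0, 1.0]]

baseline = ffn_forward(x, w_in, w_out)
suppressed = ffn_forward(x, w_in, w_out, neuron_scale={1: 0.0})
enhanced = ffn_forward(x, w_in, w_out, neuron_scale={1: 2.0})
print(baseline, suppressed, enhanced)
```

In this toy setup only the output dimension fed by neuron 1 changes, which mirrors the intuition that scaling a small set of neurons can selectively strengthen or weaken one stored association while leaving the rest of the layer's output untouched.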

Paper Structure

This paper contains 51 sections, 8 equations, 8 figures, and 8 tables.

Figures (8)

  • Figure 1: An unsuccessful case of reasoning due to factual retrieval failure of the triplet (General Motors, chairperson, Mary Barra).
  • Figure 2: Scaled visualization of neuron activities within the intermediate layers of FFNs in Mistral-7B for the same case (a 32-layer $\times$ 14336-neuron matrix). The vertical axis shows layer depth, while the horizontal axis shows the neuron index within each FFN's intermediate layer. KNs are distributed mainly in the middle and final layers.
  • Figure 3: Overall reasoning performance on TFRKN under different CoT situations.
  • Figure 4: An in-depth analysis of shortcut scenarios under no CoT. TT represents successful recall of both facts.
  • Figure 5: Results of constructing the knowledge distraction and knowledge conflict for the first-hop fact.
  • ...and 3 more figures