Does Knowledge Localization Hold True? Surprising Differences Between Entity and Relation Perspectives in Language Models

Yifan Wei, Xiaoyan Yu, Yixuan Weng, Huanhuan Ma, Yuanzhe Zhang, Jun Zhao, Kang Liu

TL;DR

Contrary to prior research suggesting that knowledge is stored in MLP weights, the experiments demonstrate that relational knowledge is also significantly encoded in attention modules, highlighting the multifaceted nature of knowledge storage in language models.

Abstract

Large language models encapsulate knowledge and have demonstrated superior performance on various natural language processing tasks. Recent studies have localized this knowledge to specific model parameters, such as the MLP weights in intermediate layers. This study investigates the differences between entity and relational knowledge through knowledge editing. Our findings reveal that entity and relational knowledge cannot be directly transferred or mapped to each other. This result is unexpected, as logically, modifying the entity or the relation within the same knowledge triplet should yield equivalent outcomes. To further elucidate the differences between entity and relational knowledge, we employ causal analysis to investigate how relational knowledge is stored in pre-trained models. Contrary to prior research suggesting that knowledge is stored in MLP weights, our experiments demonstrate that relational knowledge is also significantly encoded in attention modules. This insight highlights the multifaceted nature of knowledge storage in language models, underscoring the complexity of manipulating specific types of knowledge within these models.

Paper Structure (13 sections, 7 equations, 3 figures, 2 tables)

This paper contains 13 sections, 7 equations, 3 figures, and 2 tables.

Figures (3)

  • Figure 1: Knowledge stored within model parameters.
  • Figure 2: Causal tracing results of individual model components.
  • Figure 3: Causal effects by isolating various modules.