On Relation-Specific Neurons in Large Language Models
Yihong Liu, Runsheng Chen, Lea Hirlimann, Ahmad Dawar Hakimi, Mingyang Wang, Amir Hossein Kargaran, Sascha Rothe, François Yvon, Hinrich Schütze
TL;DR
This paper demonstrates the existence of relation-specific neurons (RelSpec) in decoder-only LLMs by applying a statistics-based identification method to Llama-2 models across 12 relations. By ablating the top RelSpec neurons and measuring the effect on generation, it shows that relational knowledge is distributed across multiple FFN neurons (cumulativity), that RelSpec neurons can be shared across related and even distant relations and languages (versatility), and that deactivating one relation’s neurons can sometimes improve recall for others (interference). Across the 7B and 13B models, RelSpec neurons concentrate in middle layers and exhibit varying degrees of cross-relation and cross-lingual transfer, indicating a structured, modular organization of relational knowledge. These findings advance mechanistic interpretability by showing how relational facts are encoded and how targeted interventions affect recall and generalization, with practical implications for probing, debugging, and guiding generation in LLMs. The work further provides empirical evidence that relational knowledge is not localized to single neurons but distributed across circuits, enabling nuanced manipulation without catastrophic disruption to general language modeling.
Abstract
In large language models (LLMs), certain \emph{neurons} can store distinct pieces of knowledge learned during pretraining. While factual knowledge typically appears as a combination of \emph{relations} and \emph{entities}, it remains unclear whether some neurons focus on a relation itself -- independent of any entity. We hypothesize such neurons \emph{detect} a relation in the input text and \emph{guide} generation involving such a relation. To investigate this, we study the Llama-2 family on a chosen set of relations, using a \textit{statistics}-based method. Our experiments demonstrate the existence of relation-specific neurons. We measure the effect of selectively deactivating candidate neurons specific to relation $r$ on the LLM's ability to handle (1) facts involving relation $r$ and (2) facts involving a different relation $r' \neq r$. With respect to their capacity for encoding relation information, we give evidence for the following three properties of relation-specific neurons. \textbf{(i) Neuron cumulativity.} Multiple neurons jointly contribute to processing facts involving relation $r$, with no single neuron fully encoding a fact in $r$ on its own. \textbf{(ii) Neuron versatility.} Neurons can be shared across multiple closely related as well as less related relations. In addition, some relation neurons transfer across languages. \textbf{(iii) Neuron interference.} Deactivating neurons specific to one relation can improve LLMs' factual recall performance for facts of other relations. We make our code and data publicly available at https://github.com/cisnlp/relation-specific-neurons.
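
To make the identify-then-deactivate procedure concrete, the following is a minimal sketch on toy data, not the authors' implementation. It assumes each neuron has a per-relation activation frequency, scores a neuron by how much more often it fires on relation $r$ than on the other relations, and deactivates the top-scoring neurons by zeroing their activations. The scoring rule, the function names, and the relation labels (`capital_of`, `born_in`, `speaks`) are all hypothetical choices made for illustration; the abstract only states that the method is statistics-based.

```python
from typing import Dict, List


def specificity_scores(act_freq: Dict[str, List[float]], relation: str) -> List[float]:
    """Score each neuron for `relation` (hypothetical criterion).

    The score is the neuron's activation frequency on `relation` minus its
    mean activation frequency on all other relations.
    """
    others = [r for r in act_freq if r != relation]
    n_neurons = len(act_freq[relation])
    scores = []
    for j in range(n_neurons):
        mean_other = sum(act_freq[r][j] for r in others) / len(others)
        scores.append(act_freq[relation][j] - mean_other)
    return scores


def top_k_neurons(scores: List[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring neurons."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def ablate(hidden: List[float], neuron_ids: List[int]) -> List[float]:
    """Deactivate the selected neurons by setting their activations to zero."""
    ids = set(neuron_ids)
    return [0.0 if j in ids else h for j, h in enumerate(hidden)]


# Toy activation frequencies for 4 neurons over 3 made-up relations.
act_freq = {
    "capital_of": [0.9, 0.8, 0.1, 0.5],
    "born_in":    [0.1, 0.7, 0.2, 0.5],
    "speaks":     [0.1, 0.1, 0.9, 0.5],
}

selected = top_k_neurons(specificity_scores(act_freq, "capital_of"), k=2)
ablated = ablate([1.0, 2.0, 3.0, 4.0], selected)
```

In this toy setup, neuron 1 fires often on both `capital_of` and `born_in`, so deactivating the neurons selected for `capital_of` would also touch a neuron used by `born_in`. That is the kind of sharing the versatility property describes, and it is why the effect of ablation is measured both on relation $r$ and on other relations $r' \neq r$.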
