PMET: Precise Model Editing in a Transformer

Xiaopeng Li; Shasha Li; Shezheng Song; Jing Yang; Jun Ma; Jie Yu

TL;DR

This work interrogates how knowledge is stored and recalled in transformer-based models, revealing that MHSA tends to extract general patterns while FFN stores factual knowledge. Building on this, PMET jointly optimizes MHSA and FFN hidden states but restricts weight updates to FFN, enabling precise and robust knowledge edits. Empirical results on GPT-J and GPT-NeoX demonstrate state-of-the-art reliability and competitive specificity across zsRE and CounterFact, with ablations clarifying the roles of MHSA optimization and the residual-spread strategy. The findings offer a nuanced view of transformer internals for targeted editing and present PMET as a practical method for high-fidelity knowledge modification.

Abstract

Model editing techniques modify a small proportion of the knowledge in Large Language Models (LLMs) at relatively low cost and have demonstrated notable success. Existing methods assume that Transformer Layer (TL) hidden states are the values of the key-value memories of the Feed-Forward Network (FFN). They usually optimize the TL hidden states to memorize target knowledge and use them to update the weights of the FFN in LLMs. However, the information flow of TL hidden states comes from three parts: Multi-Head Self-Attention (MHSA), FFN, and residual connections. Existing methods neglect the fact that the TL hidden states contain information not specifically required by the FFN. Consequently, the performance of model editing decreases. To achieve more precise model editing, we analyze the hidden states of MHSA and FFN, finding that MHSA encodes certain general knowledge extraction patterns. This implies that MHSA weights do not require updating when new knowledge is introduced. Based on these findings, we introduce PMET, which simultaneously optimizes the hidden states of both Transformer Components (TC, namely MHSA and FFN), while using only the optimized TC hidden states of the FFN to precisely update the FFN weights. Our experiments demonstrate that PMET exhibits state-of-the-art performance on both the COUNTERFACT and zsRE datasets. Our ablation experiments substantiate the effectiveness of our enhancements, further reinforcing the finding that MHSA encodes certain general knowledge extraction patterns and indicating that it stores a small amount of factual knowledge. Our code is available at https://github.com/xpq-tech/PMET.
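The distinction the abstract draws between TL-level and TC-level hidden states can be illustrated with a toy sketch. It is not the authors' implementation: it assumes the three information sources combine additively, uses plain Python lists in place of real activations, and every function name (`layer_hidden_state`, `split_update`) is hypothetical.

```python
# Toy illustration only: vectors are plain Python lists, not real model activations.
# Assumption: the residual, MHSA, and FFN contributions combine by simple addition.

def add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(components) for components in zip(*vectors)]

def layer_hidden_state(residual, mhsa_out, ffn_out):
    """TL hidden state as the sum of its three information sources."""
    return add(residual, mhsa_out, ffn_out)

def split_update(delta_mhsa, delta_ffn):
    """Return (TL-level change, FFN-only change) for a pair of optimized deltas."""
    tl_delta = add(delta_mhsa, delta_ffn)  # what a TL-level method would attribute to the FFN
    return tl_delta, delta_ffn             # PMET-style: only the FFN part drives the weight update

residual = [1.0, 0.0, 2.0]
mhsa_out = [0.5, 0.5, 0.0]
ffn_out = [0.0, 1.0, 1.0]
h = layer_hidden_state(residual, mhsa_out, ffn_out)

# Hypothetical optimized changes to each component's hidden state.
delta_mhsa = [0.25, 0.0, -0.25]
delta_ffn = [0.0, 0.5, 0.5]
tl_delta, ffn_target_delta = split_update(delta_mhsa, delta_ffn)
```

In this toy setting `tl_delta` differs from `ffn_target_delta` whenever `delta_mhsa` is nonzero, which is the kind of extra, non-FFN information the abstract argues should not be written into the FFN weights.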

Paper Structure (25 sections, 13 equations, 5 figures, 3 tables)

Introduction
Related Work
Model Editing
Post-Hoc Explanation of Transformers
Methodology
Preliminaries
Language Modeling
Model Editing Problem
Investigating the Role of MHSA and FFN in LLMs' Knowledge Recall
PMET Method
Experiments
Baselines and Datasets
Editing Experiments
Editing Knowledge in Counterfact
Editing 10K Knowledge in ZsRE
...and 10 more sections

Figures (5)

Figure 1: Comparison between PMET and existing methods in a Transformer layer. (a) Existing optimization-based methods employ optimized TL hidden states to perform vague updates on FFN weights. (b) PMET simultaneously optimizes the TC hidden states of both MHSA and FFN, but only uses the optimized TC hidden states of FFN to perform precise updates on FFN weights.
Figure 2: The changes in the average cosine similarity and average Jaccard similarity of the hidden states before and after MHSA and FFN.
Figure 3: The editing performance of PMET and baselines varies with the number of edits (X-axis).
Figure 4: The norm changes of the incremental weight $\Delta$ in the ablation experiment.
Figure 5: A sample of the zsRE dataset.
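The two similarity measures named in the Figure 2 caption can be sketched as follows. The captions do not say how a hidden state is converted into a set for the Jaccard measure, so this sketch assumes, purely for illustration, that the set is the indices of the k largest entries; the helper `top_k_indices` and the example vectors are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equally sized vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k_indices(vector, k):
    """Assumed set construction: indices of the k largest entries."""
    ranked = sorted(range(len(vector)), key=lambda i: vector[i], reverse=True)
    return set(ranked[:k])

def jaccard_similarity(set_a, set_b):
    """Size of the intersection divided by size of the union."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0

# Made-up hidden states before and after a component (not real data).
before = [3.0, 0.0, 4.0, 0.0]
after = [3.0, 0.0, 0.0, 4.0]

cos = cosine_similarity(before, after)
jac = jaccard_similarity(top_k_indices(before, 2), top_k_indices(after, 2))
```

Averaging such scores over many inputs gives one value per layer and per component, which is the sort of quantity the caption describes.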
