Constraining Sequential Model Editing with Editing Anchor Compression

Hao-Xiang Xu, Jun-Yu Ma, Zhen-Hua Ling, Ningyu Zhang, Jia-Chen Gu

TL;DR

This work addresses the degradation of general abilities in sequential knowledge editing of large language models. It identifies that edits introduce non-trivial noise which causes the edited parameter matrix to deviate from the original semantic space, impairing downstream performance. The proposed Editing Anchor Compression framework selects editing anchors via a weighted-gradient saliency map and retrains only those dimensions using a scored elastic net loss, thereby constraining update norms and preserving general abilities while maintaining editing accuracy. Across multiple models, editing methods, and tasks, EAC achieves substantial preservation of general abilities (over 70%) and better retention of edited knowledge, offering a scalable approach to continual knowledge updates in LLMs.
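The two ingredients named above, anchor selection by a saliency score and an elastic-net style penalty on the update, can be illustrated with a toy sketch. This is not the authors' implementation. The scoring rule (absolute gradient times absolute value), the `keep_ratio` parameter, and the plain (unscored) elastic net form are all assumptions made for illustration, and the function names are hypothetical.

```python
def select_anchors(value_vec, grad, keep_ratio=0.5):
    """Return a boolean mask marking the highest-saliency dimensions.

    Assumption: saliency = |gradient| * |value|; the paper's exact
    weighting may differ.
    """
    saliency = [abs(g) * abs(v) for g, v in zip(grad, value_vec)]
    k = max(1, int(round(keep_ratio * len(value_vec))))
    # Indices of the k largest saliency scores.
    top = sorted(range(len(saliency)), key=lambda i: saliency[i])[-k:]
    return [i in top for i in range(len(value_vec))]


def elastic_net_penalty(delta, l1_weight=0.5, l2_weight=0.5):
    """Plain elastic net: weighted L1 norm plus weighted squared L2 norm.

    Assumption: stands in for the paper's scored elastic net loss.
    """
    l1 = sum(abs(x) for x in delta)
    l2_sq = sum(x * x for x in delta)
    return l1_weight * l1 + l2_weight * l2_sq


# Toy update vector and its gradient (made-up numbers).
value_vec = [0.1, -2.0, 0.05, 1.5]
grad = [0.2, 1.0, 0.1, -0.8]

mask = select_anchors(value_vec, grad, keep_ratio=0.5)
# Keep only the anchor dimensions; zero out the rest.
compressed = [v if keep else 0.0 for v, keep in zip(value_vec, mask)]

print(mask)
print(compressed)
print(elastic_net_penalty(value_vec), elastic_net_penalty(compressed))
```

In this toy case the compressed vector keeps the two dominant dimensions and has a smaller penalty than the full update, which matches the intuition of constraining update norms while retaining the most informative components.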

Abstract

Large language models (LLMs) struggle with hallucinations due to false or outdated knowledge. Given the high resource demands of retraining these models, there is an increasing focus on developing model editing. However, the general abilities of LLMs across downstream tasks are prone to significant degradation during sequential editing. This paper statistically observes that the parameter matrix after editing exhibits a significant deviation compared to its previous state as the number of edits increases. This serious deviation affects the original knowledge associations within LLMs and leads to the degradation of their general abilities. To this end, a framework termed Editing Anchor Compression (EAC) is proposed to constrain the deviation of the parameter matrix during sequential editing. It compresses the editing information by selecting editing anchors that are important in encoding new relations without deviating too much from the original matrix, thereby preserving the general abilities. Experiments of applying EAC to two popular editing methods on three LLMs across four tasks are conducted. Evaluation results show that EAC effectively minimizes unreasonable deviations caused by model editing, preserving over 70% of the general abilities while better retaining the editing knowledge compared to the original counterpart methods.
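The deviation described above can be made concrete by tracking a norm of the difference between the edited matrix and the original one after each edit. The following is a minimal sketch under stated assumptions: each edit is modeled as a rank-one update (as in rank-one editing methods such as ROME), the deviation is measured as the L1 norm of the difference from the original matrix, and the matrices and update vectors are made-up toy values. The helper names are hypothetical.

```python
def l1_norm(matrix):
    """Entry-wise L1 norm: sum of absolute values of all entries."""
    return sum(abs(x) for row in matrix for x in row)


def rank_one_update(matrix, u, v):
    """Return matrix + u v^T without modifying the input."""
    return [[w + ui * vj for w, vj in zip(row, v)]
            for row, ui in zip(matrix, u)]


def deviation_trace(w0, edits):
    """Apply edits in sequence; record the L1 norm of (W_t - W_0) after each."""
    current = w0
    trace = []
    for u, v in edits:
        current = rank_one_update(current, u, v)
        diff = [[c - o for c, o in zip(c_row, o_row)]
                for c_row, o_row in zip(current, w0)]
        trace.append(l1_norm(diff))
    return trace


# Toy 2x2 weight matrix and three illustrative rank-one edits.
w0 = [[1.0, 0.0], [0.0, 1.0]]
edits = [
    ([1.0, 0.0], [0.5, 0.5]),
    ([0.0, 1.0], [0.5, -0.5]),
    ([1.0, 1.0], [0.25, 0.0]),
]

trace = deviation_trace(w0, edits)
print(trace)
```

Here the deviation grows with every edit. That is a property of these particular toy updates, not a guarantee for arbitrary edits, but it mirrors the trend the abstract reports for sequential editing.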

Paper Structure

This paper contains 37 sections, 13 equations, 19 figures, 4 tables.

Figures (19)

  • Figure 1: (a) Comparison of regular model editing and EAC. EAC compresses the editing information into the dimensions where the editing anchors are located. Here, we utilize the gradients generated during training and the magnitude of the updated knowledge vector to identify anchors. (b) Comparison of general downstream task performance before editing, after regular editing, and after constrained editing by EAC.
  • Figure 2: Illustration of the change in L1 norm (a) during sequential editing at the edited layer using editing-based methods and (b) during fine-tuning at different batch steps for selected layers. For clarity, the fine-tuning layers of GPT2-XL were selected uniformly.
  • Figure 3: Visualization of six sets of facts recalled by LLMs using 2-dimensional PCA. Note that this hidden state is also projected by a language modeling head (linear mapping) for next-token prediction, implying the linear structure in the corresponding representation space (the PCA assumption).
  • Figure 4: Proposed method: EAC. We first identify the key dimensions of the editing anchors using a weighted-gradient saliency map, followed by retraining on these dimensions to achieve the final optimization.
  • Figure 5: Sequential editing performance of ROME and MEMIT on the ZsRE dataset with GPT2-XL and LLaMA-3 (8B), before and after the introduction of EAC, as the number of edits increases.
  • ...and 14 more figures