Constraining Sequential Model Editing with Editing Anchor Compression
Hao-Xiang Xu, Jun-Yu Ma, Zhen-Hua Ling, Ningyu Zhang, Jia-Chen Gu
TL;DR
This work addresses the degradation of general abilities in sequential knowledge editing of large language models. It identifies that edits introduce non-trivial noise, which causes the edited parameter matrix to deviate from the original semantic space and impairs downstream performance. The proposed Editing Anchor Compression (EAC) framework selects editing anchors via a weighted-gradient saliency map and retrains only those dimensions using a scored elastic net loss, thereby constraining update norms and preserving general abilities while maintaining editing accuracy. Across multiple models, editing methods, and tasks, EAC preserves over 70% of general abilities and retains edited knowledge better than the original editing methods, offering a scalable approach to continual knowledge updates in LLMs.
Abstract
Large language models (LLMs) struggle with hallucinations due to false or outdated knowledge. Given the high resource demands of retraining these models, there is an increasing focus on developing model editing methods. However, the general abilities of LLMs across downstream tasks are prone to significant degradation during sequential editing. This paper observes statistically that, as the number of edits increases, the edited parameter matrix deviates significantly from its previous state. This deviation disrupts the original knowledge associations within LLMs and leads to the degradation of their general abilities. To this end, a framework termed Editing Anchor Compression (EAC) is proposed to constrain the deviation of the parameter matrix during sequential editing. It compresses the editing information by selecting editing anchors that are important for encoding new relations without deviating too much from the original matrix, thereby preserving the general abilities. EAC is applied to two popular editing methods on three LLMs and evaluated across four tasks. The results show that EAC effectively minimizes unreasonable deviations caused by model editing, preserving over 70% of the general abilities while retaining the edited knowledge better than the original counterpart methods.
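The compression idea can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the use of the Frobenius norm as the deviation measure, the top-k selection rule, and the penalty weights are all assumptions made for illustration, and the saliency scores are taken as given rather than computed from weighted gradients.

```python
import math

def frobenius_deviation(original, edited):
    """Frobenius norm of (edited - original); matrices are lists of rows.
    Assumed here as a simple proxy for parameter-matrix deviation."""
    return math.sqrt(sum((e - o) ** 2
                         for row_o, row_e in zip(original, edited)
                         for o, e in zip(row_o, row_e)))

def select_anchors(update, saliency, k):
    """Keep the k update dimensions with the highest saliency, zero the rest.
    The saliency scores are supplied by the caller (hypothetical input)."""
    top = sorted(range(len(update)), key=lambda i: saliency[i], reverse=True)[:k]
    keep = set(top)
    return [u if i in keep else 0.0 for i, u in enumerate(update)]

def elastic_net_penalty(update, l1=0.5, l2=0.5):
    """Standard elastic net penalty: l1 * ||u||_1 + l2 * ||u||_2^2.
    The weights l1 and l2 are illustrative defaults, not values from the paper."""
    return l1 * sum(abs(u) for u in update) + l2 * sum(u * u for u in update)

# Toy update vector with made-up saliency scores.
update = [0.9, -0.1, 0.05, -0.7]
saliency = [0.9, 0.1, 0.05, 0.7]
compressed = select_anchors(update, saliency, k=2)
print(compressed)  # [0.9, 0.0, 0.0, -0.7]
```

Zeroing the low-saliency dimensions can only shrink the update, so the compressed update has a norm no larger than the full one, which is the sense in which anchor selection constrains deviation in this toy setting.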
