AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models

Junfeng Fang, Houcheng Jiang, Kun Wang, Yunshan Ma, Shi Jie, Xiang Wang, Xiangnan He, Tat-seng Chua

TL;DR

AlphaEdit addresses the trade-off between updating knowledge and preserving existing knowledge in large language models by projecting parameter perturbations onto the null space of the preserved knowledge. By removing the explicit preservation term from the objective and applying a single, reusable projection matrix $\mathbf{P}$, AlphaEdit ensures that the edited model's outputs on preserved knowledge remain unchanged, which mitigates forgetting and collapse in sequential editing. Experiments on GPT-2 XL, GPT-J, and LLaMA3 show gains over state-of-the-art baselines in efficacy and generalization, and general capabilities are maintained even after extensive edits. The approach is lightweight and can be plugged into existing editing methods, which makes it practical for scalable knowledge updates in real-world deployments.

Abstract

Large language models (LLMs) often exhibit hallucinations due to incorrect or outdated knowledge. Hence, model editing methods have emerged to enable targeted knowledge updates. A prevailing paradigm is the locating-then-editing approach, which first locates influential parameters and then edits them by introducing a perturbation. While effective, current studies have demonstrated that this perturbation inevitably disrupts the knowledge originally preserved within LLMs, especially in sequential editing scenarios. To address this, we introduce AlphaEdit, a novel solution that projects the perturbation onto the null space of the preserved knowledge before applying it to the parameters. We theoretically prove that this projection ensures the output of post-edited LLMs remains unchanged when queried about the preserved knowledge, thereby mitigating the issue of disruption. Extensive experiments on various LLMs, including LLaMA3, GPT2-XL, and GPT-J, show that AlphaEdit boosts the performance of most locating-then-editing methods by an average of 36.7% with only a single additional line of code for the projection. Our code is available at: https://github.com/jianghoucheng/AlphaEdit.
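
The projection idea described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes the preserved knowledge is summarized by a key matrix `K0` (one column per preserved fact) and that the edited layer is a plain linear map `W`. The helper name `null_space_projector`, the toy dimensions, and the tolerance `tol` are all hypothetical choices made for illustration. The sketch builds a projector from the eigenvectors of `K0 @ K0.T` whose eigenvalues are numerically zero, then checks that a projected perturbation leaves the layer's outputs on `K0` unchanged.

```python
import numpy as np

def null_space_projector(K0, tol=1e-8):
    """Return P such that P @ K0 is (numerically) zero.

    K0 has shape (d_in, n): each column is a key vector standing in for
    one piece of preserved knowledge (an assumption of this sketch).
    """
    cov = K0 @ K0.T                       # (d_in, d_in), symmetric PSD
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Eigenvectors whose eigenvalues are ~0 span the left null space of K0.
    mask = eigvals < tol * eigvals.max()
    U_null = eigvecs[:, mask]
    return U_null @ U_null.T              # orthogonal projector onto that space

# Toy sizes, chosen only for illustration.
rng = np.random.default_rng(0)
d_in, d_out, n_preserved = 8, 4, 3

K0 = rng.standard_normal((d_in, n_preserved))   # preserved keys
W = rng.standard_normal((d_out, d_in))          # weights of the edited layer
delta = rng.standard_normal((d_out, d_in))      # raw perturbation from some editor

P = null_space_projector(K0)
delta_projected = delta @ P                     # the single extra projection step

# Outputs on the preserved keys are unchanged after the projected edit.
unchanged = np.allclose((W + delta_projected) @ K0, W @ K0)
print("preserved outputs unchanged:", unchanged)
```

Because `P` depends only on `K0`, it can be computed once and reused across a sequence of edits, which is consistent with the "single, reusable projection matrix" mentioned in the TL;DR.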

Paper Structure (44 sections, 28 equations, 13 figures, 7 tables)


Figures (13)

  • Figure 1: Comparison between the current methods and AlphaEdit. (a) and (d) exhibit the objectives, where $\lambda$ is the coefficient to keep balance between $e_0$ and $e_1$ in the objective; (b) and (e) show the distributions of hidden representations after dimensionality reduction within the pre-edited and post-edited LLaMA3, respectively; (c) depicts the output of the post-edited LLaMA3. Best viewed in color.
  • Figure 2: Performance of various model editing methods on LLaMA3 (8B). Results with asterisks in the superscript are from the ZsRE dataset. SST, RTE, and CoLA demonstrate the general capabilities of the post-edited LLMs. The experiments are conducted with 2000 edited samples for sequential editing. Detailed settings and results are provided in Section \ref{sec:rq1} and Table \ref{tab:overall_comp}, respectively. Best viewed in color.
  • Figure 3: Comparison between the paradigms of AlphaEdit and current methods. Best viewed in color.
  • Figure 4: F1 scores of the post-edited LLaMA3 (8B) on six tasks (i.e., SST, MRPC, CoLA, RTE, MMLU and NLI) used for general capability testing. Best viewed in color.
  • Figure 5: The distribution of hidden representations of pre-edited and post-edited LLMs after dimensionality reduction. The top and right curve graphs display the marginal distributions for two reduced dimensions, where AlphaEdit consistently exhibits minimal shift. Best viewed in color.
  • ...and 8 more figures