WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models

Peng Wang, Zexi Li, Ningyu Zhang, Ziwen Xu, Yunzhi Yao, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

TL;DR

This work tackles the challenge of updating knowledge in large language models without sacrificing previously learned information or causing unintended changes. It identifies an impossible triangle among reliability, generalization, and locality when editing either long-term or working memory. To address this, it introduces WISE, a dual-memory framework with a main memory for pretrained knowledge and a side memory for edited knowledge, coupled with a router, knowledge sharding, and a memory-merge procedure (Ties-Merge) to support continual edits. Across QA, hallucination, and OOD tasks on GPT-family, LLaMA, and Mistral models, WISE consistently outperforms baselines, scales to thousands of edits, and maintains locality while improving generalization, signaling a practical path for lifelong model editing.

Abstract

Large language models (LLMs) need knowledge updates to keep pace with ever-growing world facts and to correct hallucinated responses, which motivates methods for lifelong model editing. Where the updated knowledge resides in memory is a fundamental question for model editing. In this paper, we find that editing either long-term memory (direct model parameters) or working memory (non-parametric knowledge held in neural network activations/representations and accessed by retrieval) results in an impossible triangle: reliability, generalization, and locality cannot be realized together in lifelong editing settings. For long-term memory, directly editing the parameters causes conflicts with irrelevant pretrained knowledge or previous edits (poor reliability and locality). For working memory, retrieval-based activations can hardly make the model understand the edits and generalize (poor generalization). Therefore, we propose WISE to bridge the gap between memories. In WISE, we design a dual parametric memory scheme, which consists of a main memory for the pretrained knowledge and a side memory for the edited knowledge. We edit only the knowledge in the side memory and train a router to decide which memory to go through when given a query. For continual editing, we devise a knowledge-sharding mechanism in which different sets of edits reside in distinct subspaces of parameters and are subsequently merged into a shared memory without conflicts. Extensive experiments show that WISE can outperform previous model editing methods and overcome the impossible triangle under lifelong model editing in question answering, hallucination, and out-of-distribution settings across trending LLM architectures, e.g., GPT, LLaMA, and Mistral. Code is available at https://github.com/zjunlp/EasyEdit.
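The routing step described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the score (L2 norm of the gap between side-memory and main-memory outputs), the fixed threshold, and the names `matvec`, `routing_score`, and `route` are all hypothetical. The source only states that a router decides which memory a query goes through and that, in WISE-Retrieve, the side memory with the maximal activation score is selected.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matvec(w: Matrix, a: Sequence[float]) -> List[float]:
    """Multiply weight matrix w (one row per output unit) by activation vector a."""
    return [sum(wij * aj for wij, aj in zip(row, a)) for row in w]


def routing_score(a: Sequence[float], w_main: Matrix, w_side: Matrix) -> float:
    """Assumed score: L2 norm of the difference between side and main outputs."""
    main_out = matvec(w_main, a)
    side_out = matvec(w_side, a)
    return math.sqrt(sum((s - m) ** 2 for s, m in zip(side_out, main_out)))


def route(
    a: Sequence[float],
    w_main: Matrix,
    side_memories: List[Matrix],
    threshold: float,
) -> Tuple[List[float], int]:
    """Return (output, index); index is -1 when the main memory answers."""
    scores = [routing_score(a, w_main, w) for w in side_memories]
    # Pick the side memory with the largest score (hypothetical retrieval rule).
    best = max(range(len(scores)), key=scores.__getitem__, default=-1)
    if best >= 0 and scores[best] > threshold:
        return matvec(side_memories[best], a), best
    return matvec(w_main, a), -1


# Toy example: the side memory differs from the main memory only along the
# second input dimension, so only queries using that dimension are rerouted.
w_main = [[1.0, 0.0], [0.0, 1.0]]
w_side = [[1.0, 0.0], [0.0, 3.0]]
print(route([1.0, 0.0], w_main, [w_side], threshold=0.5))  # main memory path
print(route([0.0, 1.0], w_main, [w_side], threshold=0.5))  # side memory path
```

In this toy setup, a query that the edit did not touch produces identical outputs from both memories, so its score is zero and it stays on the main path; that is the locality property the abstract refers to.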

Paper Structure

This paper contains 55 sections, 2 theorems, 11 equations, 13 figures, 13 tables, 2 algorithms.

Key Result

Theorem 2.1

Subspace Overlap. Generate $k$ memory subspaces $\mathbf{W}_{v'}^{i}$, $i \in [k]$, by random masks whose ratio of 1's is $\rho$, so each memory has $\rho\cdot|\mathbf{W}_{v'}|$ active trained parameters. For any two subspaces $\mathbf{W}_{v'}^{i}$ and $\mathbf{W}_{v'}^{j}$ ($i\neq j$; $i,j \in [k]$), there are ...
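The masking setup in the theorem can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: each mask is drawn independently and uniformly with exactly `round(rho * size)` active positions, and `random_mask` and `overlap_fraction` are hypothetical helper names, not code from the paper. Under these assumptions, elementary probability gives an expected pairwise overlap of $\rho^2$ of all parameters, which the simulation should approximate.

```python
import random
from typing import Set


def random_mask(size: int, rho: float, rng: random.Random) -> Set[int]:
    """Return the indices of active parameters: exactly round(rho * size) of them."""
    return set(rng.sample(range(size), round(rho * size)))


def overlap_fraction(m1: Set[int], m2: Set[int], size: int) -> float:
    """Fraction of all parameter positions that are active in both masks."""
    return len(m1 & m2) / size


rng = random.Random(0)          # fixed seed so the run is repeatable
size, rho, k = 100_000, 0.2, 3  # illustrative values, not taken from the paper
masks = [random_mask(size, rho, rng) for _ in range(k)]

pairwise = overlap_fraction(masks[0], masks[1], size)
print(f"active ratio per mask: {len(masks[0]) / size:.3f}")
print(f"pairwise overlap: {pairwise:.4f} (rho**2 = {rho ** 2:.4f})")
```

A smaller $\rho$ gives each set of edits a more private subspace, while a larger $\rho$ gives each edit more trainable capacity and more shared positions.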

Figures (13)

  • Figure 1: Metric triangle among reliability, generalization, and locality. ZsRE dataset, number of continual edits $T=100$, LLaMA-2-7B. Editing methods based on long-term memory (ROME and FT-EWC) and working memory (DEFER and GRACE) show the impossible triangle in metrics, while our WISE is leading in all three metrics.
  • Figure 2: Overview of WISE. Side memory (in blue) and main memory (in green) store edited and pretrained knowledge, respectively. Note: during inference with WISE-Retrieve, activation routing retrieves and selects the one side memory with the maximal activation score.
  • Figure 3: Activations of the memory routing module of WISE when varying $T$. X-axis: number of edits. LLaMA-7B.
  • Figure 4: Analysis of locating FFN layer of side memory for WISE. ZsRE, LLaMA-2-7B.
  • Figure 5: Analysis of different mask ratios $\rho$ and subspaces $k$ for WISE. Left: Avg. performance of Rel., Gen., and Loc.; Right: the subspace overlap probability in Theorem 2.1. ZsRE, LLaMA-2-7B.
  • ...and 8 more figures

Theorems & Definitions (2)

  • Theorem 2.1
  • Theorem C.1