Resolving Editing-Unlearning Conflicts: A Knowledge Codebook Framework for Large Language Model Updating

Binchi Zhang, Zhengzhang Chen, Zaiyi Zheng, Jundong Li, Haifeng Chen

TL;DR

Updating large language models requires simultaneous unlearning of outdated information and editing of new knowledge, which often incurs gradient conflicts and inefficient memory use. The paper introduces LOKA, a conflict-free updating framework built on a knowledge codebook of multiple memories, a similarity-aware mapping to cluster related knowledge, a conflict-score mechanism to choose between task-specific and multi-task memories, and a learning-based router to gate codebook usage at inference, with scalable sequential updating via Locality-Sensitive Hashing. The authors provide a theoretical analysis of editing-unlearning conflicts and demonstrate, across TOFU, PKU-SafeRLHF, and ZsRE benchmarks, that LOKA achieves superior unlearning and editing performance while preserving remaining knowledge, outperforming traditional fine-tuning and memory-based baselines. The work emphasizes practical impact by enabling safer, up-to-date, and customizable knowledge management in LLMs, including sequential updates and potential for personalized deployments.

Abstract

Large Language Models (LLMs) excel in natural language processing by encoding extensive human knowledge, but their utility relies on timely updates as knowledge evolves. Updating LLMs involves two key tasks simultaneously: unlearning to remove unwanted knowledge and editing to incorporate new information. Existing methods face two major challenges: ineffective knowledge storage (either too sparse or too dense) and task conflicts between editing and unlearning, as validated through our theoretical and experimental results. To address these issues, we propose LOKA, a conflict-free framework for LLM updating based on a knowledge codebook. During training, updated knowledge is stored in multiple codebook memories. To optimize knowledge storage, a similarity-aware knowledge mapping ensures that related knowledge pieces are clustered and allocated to the same memory. Additionally, LOKA resolves task conflicts by employing task-specific and multi-task memories guided by a conflict score. In the inference stage, LOKA retrieves the most relevant memory from the codebook and plugs it into the original LLM to apply the updated knowledge. A learning-based router controls codebook activation to further improve knowledge utilization. Extensive experiments demonstrate the effectiveness of LOKA in LLM knowledge updating tasks.
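The inference path described above (retrieve the most relevant memory, then let a router decide whether to use it) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the `Codebook` class, the cosine-similarity lookup, the string payloads, and especially the fixed `threshold`, which stands in for the learning-based router and is not how the paper implements it.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

class Codebook:
    """Toy codebook (hypothetical): each entry pairs a key vector with a memory."""

    def __init__(self):
        self.keys = []
        self.memories = []

    def add(self, key, memory):
        self.keys.append(key)
        self.memories.append(memory)

    def retrieve(self, query):
        """Return (best_similarity, memory) for the key closest to the query."""
        scores = [cosine(query, k) for k in self.keys]
        best = max(range(len(scores)), key=scores.__getitem__)
        return scores[best], self.memories[best]

def answer(query, codebook, threshold=0.8):
    """Stand-in router: a fixed threshold replaces the learned gate (assumption)."""
    score, memory = codebook.retrieve(query)
    # Fall back to the unmodified model when no memory matches well enough.
    return memory if score >= threshold else "original-llm"

codebook = Codebook()
codebook.add([1.0, 0.0], "memory-A")
codebook.add([0.0, 1.0], "memory-B")

print(answer([0.9, 0.1], codebook))  # close to the first key
print(answer([0.7, 0.7], codebook))  # ambiguous query, gate stays closed
```

In this toy setup, a query close to a stored key activates that memory, while a query that matches no key strongly is answered by the original model, which is the role the abstract assigns to the router.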

Paper Structure

This paper contains 34 sections, 2 theorems, 8 equations, 6 figures, 13 tables.

Key Result

Theorem 2.3

Define $\mathcal{L}_e$ and $\mathcal{L}_u$ as the editing loss (eq:edit_loss) and the unlearning loss (eq:unlearn_loss). When the Lipschitz assumption (asp:lipschitz) holds, if $d_{TV}(\mathcal{D}_e,\mathcal{D}_u)\leq\frac{1}{2C}\max\{\|\nabla\mathcal{L}_e\|,\|\nabla\mathcal{L}_u\|\}$, conflicts exist between $\mathcal{L}_e$ and $\mathcal{L}_u$, i.e., ...
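To make the notion of a gradient conflict concrete, here is a minimal numerical sketch. It assumes the common convention that two task gradients conflict when their cosine similarity (equivalently, their inner product) is negative. The quadratic losses and the names `grad_edit` and `grad_unlearn` are hypothetical stand-ins, not the paper's $\mathcal{L}_e$ and $\mathcal{L}_u$.

```python
import math

def grad_edit(theta, target):
    """Gradient of 0.5 * ||theta - target||^2 (pull theta toward target)."""
    return [t - a for t, a in zip(theta, target)]

def grad_unlearn(theta, forget):
    """Gradient of -0.5 * ||theta - forget||^2 (push theta away from forget)."""
    return [-(t - b) for t, b in zip(theta, forget)]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

theta = [0.0, 0.0]
g_e = grad_edit(theta, target=[1.0, 0.0])

# Forget point close to the edit target: the two updates pull in opposite directions.
g_u_near = grad_unlearn(theta, forget=[1.0, 0.1])
# Forget point far from the edit target: the two updates agree.
g_u_far = grad_unlearn(theta, forget=[-1.0, 0.0])

print("near:", cosine(g_e, g_u_near) < 0)  # conflict under the assumed convention
print("far:", cosine(g_e, g_u_far) < 0)    # no conflict
```

In this toy setting the conflict appears when the data to learn and the data to forget lie close together, which loosely echoes the theorem's condition that the two distributions are close in total variation distance. It is an illustration only, not a derivation of the result.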

Figures (6)

  • Figure 1: Histogram of the cosine similarity between editing and unlearning task gradients in one epoch (left for TOFU in-profile dataset and right for TOFU out-profile dataset).
  • Figure 2: Illustration of two types of knowledge storage methods, including their rationale and performance.
  • Figure 3: Overview of the LOKA framework. During training, knowledge pieces are encoded into the codebook memories for conflict-free learning. During inference, the memory most relevant to the input is retrieved and integrated with the original LLM to enhance responses.
  • Figure 4: Experimental results of sequential knowledge updating.
  • Figure 5: Illustration of the relationship between four subsets to evaluate LLM knowledge updating.
  • ...and 1 more figure

Theorems & Definitions (5)

  • Definition 2.1
  • Theorem 2.3
  • Proposition 2.4
  • proof
  • proof