Resolving Editing-Unlearning Conflicts: A Knowledge Codebook Framework for Large Language Model Updating

Binchi Zhang; Zhengzhang Chen; Zaiyi Zheng; Jundong Li; Haifeng Chen

TL;DR

Updating a large language model requires unlearning outdated information and editing in new knowledge at the same time, which often causes gradient conflicts between the two tasks and inefficient use of memory. The paper introduces LOKA, a conflict-free updating framework with four components: a knowledge codebook of multiple memories, a similarity-aware mapping that clusters related knowledge into the same memory, a conflict score that chooses between task-specific and multi-task memories, and a learning-based router that gates codebook usage at inference. Locality-Sensitive Hashing keeps sequential updating scalable. The authors provide a theoretical analysis of editing-unlearning conflicts and demonstrate, on the TOFU, PKU-SafeRLHF, and ZsRE benchmarks, that LOKA achieves better unlearning and editing performance than traditional fine-tuning and memory-based baselines while preserving the remaining knowledge. The intended practical impact is safer, up-to-date, and customizable knowledge management in LLMs, including sequential updates and potential personalized deployments.

Abstract

Large Language Models (LLMs) excel in natural language processing by encoding extensive human knowledge, but their utility relies on timely updates as knowledge evolves. Updating LLMs involves two key tasks simultaneously: unlearning to remove unwanted knowledge and editing to incorporate new information. Existing methods face two major challenges: ineffective knowledge storage (either too sparse or too dense) and task conflicts between editing and unlearning, as validated through our theoretical and experimental results. To address these issues, we propose LOKA, a conflict-free framework for LLM updating based on a knowledge codebook. During training, updated knowledge is stored in multiple codebook memories. To optimize knowledge storage, a similarity-aware knowledge mapping ensures that related knowledge pieces are clustered and allocated to the same memory. Additionally, LOKA resolves task conflicts by employing task-specific and multi-task memories guided by a conflict score. In the inference stage, LOKA retrieves the most relevant memory from the codebook and plugs it into the original LLM to apply the updated knowledge. A learning-based router controls codebook activation to further improve knowledge utilization. Extensive experiments demonstrate the effectiveness of LOKA in LLM knowledge updating tasks.
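The inference-time flow described above (a router decides whether to use the codebook, then the most relevant memory is retrieved) and the conflict-guided choice between task-specific and multi-task memories can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not define the conflict score or the router, so the negative-cosine conflict measure, the thresholds, the precomputed router score, and all names below (`choose_memory_type`, `retrieve`, `codebook`) are assumptions made purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def choose_memory_type(grad_edit, grad_unlearn, threshold=0.0):
    """Hypothetical conflict rule (an assumption, not the paper's definition).

    The conflict is taken as the negative cosine between the editing and
    unlearning gradients: opposing gradients give a high conflict and are
    routed to task-specific memories, aligned ones to a multi-task memory.
    """
    conflict = -cosine(grad_edit, grad_unlearn)
    return "task_specific" if conflict > threshold else "multi_task"


def retrieve(query, codebook, router_score, gate=0.5):
    """Return the name of the most similar memory, or None.

    `codebook` maps a memory name to a prototype vector. `router_score`
    stands in for the output of a learned router; below `gate` the original
    LLM is used unchanged, so no memory is returned.
    """
    if router_score < gate:
        return None
    return max(codebook, key=lambda name: cosine(query, codebook[name]))


if __name__ == "__main__":
    codebook = {"memory_a": [1.0, 0.0], "memory_b": [0.0, 1.0]}
    print(choose_memory_type([1.0, 0.0], [-1.0, 0.0]))
    print(retrieve([1.0, 0.1], codebook, router_score=0.9))
    print(retrieve([1.0, 0.1], codebook, router_score=0.2))
```

In this sketch the retrieved name would identify which stored memory to plug into the model; how the memory is parameterized and attached to the LLM is left out because the abstract does not specify it.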
