WilKE: Wise-Layer Knowledge Editor for Lifelong Knowledge Editing

Chenhui Hu, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao

TL;DR

This paper tackles the challenge of lifelong knowledge editing in large language models by identifying pattern unmatch as the root cause of toxicity buildup and toxicity flash during continual edits. It introduces WilKE, a Wise-Layer Knowledge Editor that abandons predefined editing layers in favor of pattern-matching-based layer selection across the model, and demonstrates through experiments on GPT2-XL and GPT-J that WilKE yields substantial gains over state-of-the-art methods in lifelong editing. The work highlights that current methods suffer from side effects that degrade broader model behavior, and shows how cross-layer pattern alignment can mitigate these effects while maintaining edit effectiveness. The findings have practical implications for scalable, continual knowledge updating in LLMs, with considerations for safety, generalization, and deployment in real-world systems.

Abstract

Knowledge editing aims to rectify outdated or erroneous knowledge in large language models (LLMs) without costly retraining. However, current knowledge editing methods primarily focus on single editing and fail to meet the requirements of lifelong editing. This study reveals that knowledge editing degrades in performance under lifelong editing, a degradation characterized by toxicity buildup and toxicity flash, with pattern unmatch identified as the primary cause. We introduce a knowledge editing approach named Wise-Layer Knowledge Editor (WilKE), which selects the editing layer based on the pattern matching degree of the editing knowledge across different layers in language models. Experimental results demonstrate that, in lifelong editing, WilKE exhibits an average improvement of 46.2% and 67.8% on editing GPT2-XL and GPT-J, respectively, relative to state-of-the-art knowledge editing methods.

Paper Structure (29 sections, 11 equations, 43 figures, 4 tables)

Figures (43)

  • Figure 1: Single editing versus lifelong editing. (a) Single editing involves making only one edit. (b) Lifelong editing involves continuous edits and performance monitoring.
  • Figure 2: Illustration of our work. Predefined editing layers may not necessarily accommodate all editing knowledge effectively. Therefore, it would be wiser to select different editing layers for different editing knowledge.
  • Figure 3: The toxicity on GPT2-XL with editing steps.
  • Figure 4: The toxicity on GPT-J with editing steps.
  • Figure 5: Toxicity buildup on GPT2-XL with editing steps.
  • ...and 38 more figures