Related Knowledge Perturbation Matters: Rethinking Multiple Pieces of Knowledge Editing in Same-Subject

Zenghao Duan, Wenbin Duan, Zhiyi Yin, Yinghan Shen, Shaoling Jing, Jie Zhang, Huawei Shen, Xueqi Cheng

TL;DR

This work tackles updating entity-centric knowledge in LLMs by formalizing Same-Subject Editing and introducing the $\text{S}^2\text{RKE}$ benchmark to study edits across multiple related facts for a single subject. It reveals a phenomenon called related knowledge perturbation in popular locate-then-edit methods, where subsequent edits interfere with earlier ones due to over-reliance on subject-derived keys in the MLP down-projection layer, evidenced by a high cosine similarity between keys and reduced Efficacy Success for the first edit. The findings are demonstrated across multiple models (e.g., GPT-2 XL, GPT-J, LLaMA-2-7B) and editing methods, highlighting a significant gap in current approaches and motivating the search for editing strategies that decouple edits from single-subject cues. The work underlines the practical impact of robust same-subject editing for coherent, multi-attribute knowledge updates in dynamic real-world information. The key in question is the averaged representation $k_* = \frac{1}{N} \sum_{i=1}^N \mathcal{K}(x_i \oplus p)$.
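The key-similarity argument can be illustrated with a toy numerical sketch. This is not the paper's code: `toy_key` is a hypothetical stand-in for $\mathcal{K}$, the vectors are random, and reading $x_i$ as random prefixes and $p$ as the subject-bearing prompt is an assumption. It only shows that when keys are dominated by a shared subject component, the averaged keys of two different relations for the same subject end up nearly parallel.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64  # hypothetical key dimension, chosen only for illustration


def toy_key(subject_vec, prefix_vec, relation_vec):
    """Hypothetical stand-in for K(x_i ⊕ p): the subject term dominates."""
    return subject_vec + 0.1 * prefix_vec + 0.1 * relation_vec


def averaged_key(subject_vec, relation_vec, n_prefixes=8):
    """k_* = (1/N) * sum_i K(x_i ⊕ p), averaged over N random prefixes."""
    prefixes = rng.standard_normal((n_prefixes, DIM))
    keys = np.stack([toy_key(subject_vec, x, relation_vec) for x in prefixes])
    return keys.mean(axis=0)


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


subject_a = rng.standard_normal(DIM)
subject_b = rng.standard_normal(DIM)
relation_1 = rng.standard_normal(DIM)
relation_2 = rng.standard_normal(DIM)

# Two edits on the same subject with different relations.
same_subject_sim = cosine(
    averaged_key(subject_a, relation_1), averaged_key(subject_a, relation_2)
)
# Two edits on different subjects.
diff_subject_sim = cosine(
    averaged_key(subject_a, relation_1), averaged_key(subject_b, relation_2)
)

print(f"same subject: {same_subject_sim:.3f}, different subjects: {diff_subject_sim:.3f}")
```

In this toy setting the same-subject keys are almost identical, so a second update written against one key would also land on the other; that is the kind of interference the summary attributes to subject-derived keys.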

Abstract

Knowledge editing has become a promising approach for efficiently and precisely updating knowledge embedded in large language models (LLMs). In this work, we focus on Same-Subject Editing, which involves modifying multiple attributes of a single entity to ensure comprehensive and consistent updates to entity-centric knowledge. Through preliminary observation, we identify a significant challenge: current state-of-the-art editing methods struggle when tasked with editing multiple related knowledge pieces for the same subject. To address the lack of relevant editing data for identical subjects in traditional benchmarks, we introduce the $\text{S}^2\text{RKE}$ (Same-Subject Related Knowledge Editing) benchmark. Our extensive experiments reveal that only mainstream locate-then-edit methods, such as ROME and MEMIT, exhibit "related knowledge perturbation," where subsequent edits interfere with earlier ones. Further analysis reveals that these methods over-rely on subject information and neglect other critical factors, resulting in reduced editing effectiveness.

Paper Structure

This paper contains 25 sections, 4 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: Comparison of performance on Different and Same-Subject Editing. (a) Editing individual knowledge pieces for distinct subjects, "James" and "Messi," results in excellent performance. (b) Editing two related knowledge pieces for the same subject, "James," leads to poor performance.
  • Figure 2: Results of sequential editing on GPT-J using MEMIT under three different schemes, compared on five evaluation metrics. Score (S), Efficacy Success (ES), and Paraphrase Success (PS) decreased consistently as subject density increased, while Neighborhood Success (NS) and Perplexity (PPL) remained unchanged.
  • Figure 3: Differences in sequential-editing results between two scenarios on three LLMs across six editing methods. Score Difference (SD) is the difference in editing performance between the two experimental schemes when the same amount of knowledge is edited with the same method.
  • Figure 4: The results of sequential-editing on GPT-2 XL and GPT-J using mainstream locate-then-edit methods. The bars represent the Score (S) of two strategies, and the line represents the Score Difference (SD) between the two strategies.
  • Figure 5: Illustration of related knowledge perturbation in same-subject editing.
  • ...and 4 more figures