Stable Knowledge Editing in Large Language Models

Zihao Wei, Liang Pang, Hanxing Ding, Jingcheng Deng, Huawei Shen, Xueqi Cheng

TL;DR

This paper tackles the instability of knowledge editing in large language models by challenging the localization assumption and introducing StableKE, a data-centric approach that uses semantic paraphrase enhancement (SPE) and contextual description enrichment (CDE) to augment knowledge descriptions and surrounding context. It pairs StableKE with KEBench, a tree-structured, multi-hop knowledge editing benchmark that evaluates four stability dimensions: edited knowledge, multi-hop reasoning, unrelated knowledge, and general capabilities. Empirical results show StableKE outperforms prior methods across large-scale batch and sequential edits, maintains unrelated knowledge better, and preserves multi-hop reasoning and instruction-following abilities, including on ChatGPT. The work also demonstrates that data quality and diverse augmentation are crucial for robust editing, while noting limitations in sequential edits and some multi-hop gaps that point to future improvements and broader applicability including non-public models via finetuning APIs.

Abstract

Efficient knowledge editing of large language models is crucial for replacing obsolete information or incorporating specialized knowledge on a large scale. However, previous methods implicitly assume that knowledge is localized and isolated within the model, an assumption that oversimplifies the interconnected nature of model knowledge. The localization premise results in incomplete knowledge editing, whereas the isolation premise may impair both other knowledge and general abilities. Together, these assumptions make the performance of knowledge editing methods unstable. To transcend these assumptions, we introduce StableKE, a method that adopts a novel perspective based on knowledge augmentation rather than knowledge localization. To overcome the expense of human labeling, StableKE integrates two automated knowledge augmentation strategies: the Semantic Paraphrase Enhancement strategy, which diversifies knowledge descriptions to facilitate the teaching of new information to the model, and the Contextual Description Enrichment strategy, which expands the surrounding knowledge to prevent the forgetting of related information. StableKE surpasses other knowledge editing methods, demonstrating stability in both edited knowledge and multi-hop knowledge, while also preserving unrelated knowledge and general abilities. Moreover, StableKE can edit knowledge on ChatGPT.

Paper Structure

This paper contains 28 sections, 9 equations, 5 figures, 13 tables.

Figures (5)

  • Figure 1: To enhance the evaluation of a knowledge editing method, we propose assessing it across four dimensions of stability. (1) Edited Knowledge Stability reflects the performance of one-hop knowledge editing, focusing on the consistency and accuracy of edited knowledge. (2) Multi-hop Knowledge Stability evaluates how well the edited knowledge integrates with existing knowledge across multiple steps. (3) Unrelated Knowledge Stability and (4) General Ability Stability ensure that unrelated knowledge remains unchanged and that the overall capabilities of the model are maintained despite the editing process.
  • Figure 2: An example of our KEBench.
  • Figure 3: Impact of batch size $N_{batch}$ on MEMIT and StableKE performance across four stability aspects.
  • Figure 4: Impact of sequential size $N_{seq}$ on MEMIT and StableKE performance across four stability aspects.
  • Figure 5: Impact of semantic paraphrase quantity on StableKE performance in Vicuna-7B and Vicuna-13B.