AnyEdit: Edit Any Knowledge Encoded in Language Models

Houcheng Jiang, Junfeng Fang, Ningyu Zhang, Guojun Ma, Mingyang Wan, Xiang Wang, Xiangnan He, Tat-seng Chua

TL;DR

AnyEdit introduces an autoregressive, chunk-based knowledge editing framework to update long-form and diverse-formatted knowledge encoded in large language models. Grounded in the chain rule of mutual information, it sequentially edits chunks to avoid interference and scale to arbitrary lengths, while remaining plug-and-play with existing editing methods. Empirically, AnyEdit achieves about a 21.5% average improvement on long-form and diverse-form benchmarks and demonstrates strong generalization across domains, with insights into chunk sizing for practical deployment. The work also contributes the EditEverything dataset and shows that chunk-based autoregressive editing significantly broadens the scope and practicality of knowledge editing in LLMs.

Abstract

Large language models (LLMs) often produce incorrect or outdated information, necessitating efficient and precise knowledge updates. Current model editing methods, however, struggle with long-form knowledge in diverse formats, such as poetry, code snippets, and mathematical derivations. These difficulties arise from their reliance on editing a single token's hidden state, a limitation we term the "efficacy barrier". To solve this, we propose AnyEdit, a new autoregressive editing paradigm. It decomposes long-form knowledge into sequential chunks and iteratively edits the key token in each chunk, ensuring consistent and accurate outputs. Theoretically, we ground AnyEdit in the Chain Rule of Mutual Information, showing its ability to update any knowledge within LLMs. Empirically, it outperforms strong baselines by 21.5% on benchmarks including UnKEBench, AKEW, and our new EditEverything dataset for long-form diverse-formatted knowledge. Additionally, AnyEdit serves as a plug-and-play framework, enabling current editing methods to update knowledge of arbitrary length and format, significantly advancing the scope and practicality of LLM knowledge editing.
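The chunk-by-chunk procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `split_into_chunks`, `autoregressive_edit`, and `edit_fn` are hypothetical, fixed-size chunking is only one possible strategy, and treating the last token of the running context as the "key token" is an assumption made here. `edit_fn` stands in for any existing single-step editing method.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_chunks(tokens: Sequence[str], chunk_size: int) -> List[List[str]]:
    """Split a token sequence into consecutive chunks of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [list(tokens[i:i + chunk_size]) for i in range(0, len(tokens), chunk_size)]


def autoregressive_edit(
    prompt: Sequence[str],
    target: Sequence[str],
    chunk_size: int,
    edit_fn: Callable[[List[str], List[str]], None],
) -> List[Tuple[List[str], List[str]]]:
    """Call edit_fn once per chunk, conditioning on the prompt plus all earlier chunks.

    edit_fn(context, chunk) is a hypothetical hook: it is expected to edit the
    model so that `chunk` is produced after `context` (assumed here to act on
    the last token of `context`).
    """
    context = list(prompt)
    steps = []
    for chunk in split_into_chunks(target, chunk_size):
        edit_fn(list(context), chunk)          # one edit per chunk
        steps.append((list(context), chunk))   # record what was edited
        context.extend(chunk)                  # later chunks see earlier ones
    return steps


if __name__ == "__main__":
    log = []
    autoregressive_edit(
        prompt=["Q:", "write", "a", "haiku"],
        target=["old", "pond", "a", "frog", "jumps", "in", "water", "sound"],
        chunk_size=3,
        edit_fn=lambda ctx, chunk: log.append((ctx[-1], chunk)),
    )
    for key_token, chunk in log:
        print(key_token, "->", chunk)
```

Passing the editing routine in as a callable mirrors the plug-and-play claim: the loop only fixes the order of edits and the context each edit sees, and leaves the editing method itself open.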

Paper Structure

This paper contains 31 sections, 1 theorem, 37 equations, 9 figures, and 4 tables.

Key Result

Theorem 2.1

The optimization objective is equivalent to maximizing the conditional mutual information (CMI) between $X$ and $Y$ given the perturbed hidden state ${\bm{h}}'$.
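As background, the chain rule of mutual information that the abstract invokes can be written in its standard textbook form. The chunk notation $Y_1, \dots, Y_K$ is an assumption introduced here for illustration (reading $Y$ as a sequence of $K$ chunks); this is the general identity, not the equation from the paper's theorem.

$$
I(X; Y_1, \dots, Y_K) = \sum_{k=1}^{K} I\left(X; Y_k \mid Y_1, \dots, Y_{k-1}\right)
$$

Each term on the right measures the information about one chunk given all earlier chunks, which is the shape of decomposition that a sequential, chunk-by-chunk editing scheme relies on.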

Figures (9)

  • Figure 1: Comparison of current methods and our AnyEdit. (a) and (d) illustrate the editing processes; (c) and (e) show the editing efficacy as the number of tokens within the to-be-updated knowledge increases; (b) and (f) depict the type of knowledge that each method can edit.
  • Figure 2: Relationship between knowledge format, original output probability, and efficacy when applying advanced editing methods to update triplet-structured and diverse-formatted knowledge. For each category, we randomly sample 200 knowledge instances to conduct experiments.
  • Figure 3: Relationship between the number of tokens in to-be-updated knowledge, probability shift under random perturbations, and efficacy. We conduct experiments by truncating the sampled knowledge instances to enable editing across different token lengths. The lighter-colored bands represent variance.
  • Figure 4: Performance comparison between AnyEdit and baseline methods on long-form diverse-formatted knowledge. (a) The performance of various methods on the EditEverything dataset as a function of the number of target tokens edited. (b) A comparison of different editing methods across various types of knowledge. Knowledge types without underlining are scored with ROUGE-L, while underlined types are scored with BERTScore. Best viewed in color.
  • Figure 5: Performance improvements of baseline editing methods (i.e., MEMIT, AlphaEdit, and UnKE) after incorporating the autoregressive editing paradigm of AnyEdit. The yellow bars represent the original performance of each baseline, while the blue bars represent the performance after incorporating it. Best viewed in color.
  • ...and 4 more figures

Theorems & Definitions (2)

  • Theorem 2.1
  • proof