See the Unseen: Better Context-Consistent Knowledge-Editing by Noises

Youcheng Huang; Wenqiang Lei; Zheng Zhang; Jiancheng Lv; Shuicheng Yan

TL;DR

This work tackles the problem of editing LLM knowledge while preserving context-consistency, revealing that context-induced FFN activation shifts follow a Gaussian-like pattern. It proposes Deep Noise Editing (DNE), injecting Gaussian-like noise into FFN activations across multiple layers to simulate unseen contexts during editing, building on ROME/MEMIT frameworks. Empirical results on GPT2-xl, GPT-J, and LLaMA-2 across zsRE and Counterfacts show that DNE improves generalization to paraphrased prompts and related contexts, often outperforming NoisyTune and related baselines. The approach offers a practical path to more robust, context-aware knowledge edits with broad implications for interpretability and safe deployment of edited LLMs.

Abstract

Knowledge-editing updates the knowledge of large language models (LLMs) and contributes to their interpretability and application. However, knowledge application is context-consistent: LLMs can recall the same knowledge in different contexts. Existing works ignore this property, so their edits lack generalization. In this paper, we empirically find that the effects of different contexts on LLMs recalling the same knowledge follow a Gaussian-like distribution. We then sample Gaussian noise to simulate the effects of different contexts when updating LLMs. In this way, we let LLMs see the unseen contexts in which the edited knowledge will be applied, thereby improving the generalization of editing. Experimental results on three LLMs demonstrate the effectiveness of our method and also distinguish it from other approaches that fine-tune LLMs with noise.
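
The core idea of sampling Gaussian noise to stand in for unseen contexts can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `simulate_contexts`, the activation width of 8, and the value `sigma=0.1` are all hypothetical, and the sketch perturbs a single activation vector rather than FFN activations across multiple layers inside a ROME/MEMIT-style edit.

```python
import numpy as np

def simulate_contexts(activation, sigma, n_samples, seed=0):
    """Return n_samples noisy copies of one FFN activation vector.

    Each copy adds zero-mean Gaussian noise with standard deviation sigma,
    standing in for the shift an unseen context might cause (an assumption
    made for illustration, not the paper's exact noise model).
    """
    rng = np.random.default_rng(seed)
    activation = np.asarray(activation, dtype=np.float64)
    noise = rng.normal(loc=0.0, scale=sigma,
                       size=(n_samples, activation.shape[0]))
    # Broadcast the original vector over every noise sample.
    return activation[None, :] + noise

# Hypothetical activation of width 8; real FFN widths are far larger.
h = np.linspace(-1.0, 1.0, 8)
samples = simulate_contexts(h, sigma=0.1, n_samples=1000)
print(samples.shape)
```

An editing objective could then be averaged over such perturbed activations instead of being computed on the single observed one, which is one plausible way to read "seeing the unseen contexts"; how the noise scale is chosen and where it is injected are details left to the paper itself.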

Paper Structure (21 sections, 5 equations, 12 figures, 11 tables)

Introduction
Background and Related Works
Knowledge-Editing: the Task Setting
Related Works on Knowledge-Editing
Knowledge-editing and the Interpretability of Transformers
The Knowledge Context-Consistency
FFNs Activation in Paraphrased Contexts
What are the Factors?
See the Unseen: Deep Noise Editing
Experiments & Results
Experimental Settings
Knowledge-Editing using zsRE
Knowledge-Editing using Counterfacts
Experiments with LLaMA-2
Comparing with others of adding noises
...and 6 more sections

Figures (12)

Figure 1: Different contexts induce shifts in FFNs' activations on knowledge-related tokens, and these shifts follow a Gaussian-like distribution. We achieve better context-consistent knowledge-editing by sampling noises to simulate the effects.
Figure 2: GPT2-xl ${\mathbb{H}}_s,\!{\mathbb{H}}_c$.
Figure 3: GPT2-xl ${\mathbb{D}}_s,\!{\mathbb{D}}_c$.
Figure 4: GPT-J ${\mathbb{H}}_s,\!{\mathbb{H}}_c$.
Figure 5: GPT-J ${\mathbb{D}}_s,\!{\mathbb{D}}_c$.
...and 7 more figures
