Consecutive Batch Model Editing with HooK Layers

Shuaiyi Li, Yang Deng, Deng Cai, Hongyuan Lu, Liang Chen, Wai Lam

TL;DR

This work tackles the practical problem of updating knowledge in large language models without costly retraining or unbounded memory growth. It introduces CoachHooK, a memory-efficient editing framework that enables consecutive batch edits by inserting hook layers to capture editing changes while leaving the base weights intact; a transformer memory-updating mechanism and dynamic local-scope detection govern when and where edits apply. By leveraging outlier-based local editing scope identification and a dynamic threshold $\alpha$, CoachHooK demonstrates strong performance on ZsRE and COUNTERFACT across GPT2-XL and GPT-J in both single-round and consecutive editing scenarios, often outperforming batch-editing baselines in reliability, generality, and locality. The approach achieves these gains with modest inference-time and memory overhead, suggesting practical applicability for iterative knowledge updates in real-world LLM deployments.

Abstract

As the typical retraining paradigm is unacceptably time- and resource-consuming, researchers are turning to model editing to find an effective way to edit model behavior directly that supports both consecutive and batch scenarios. Despite these practical expectations, existing model editing methods fail to meet all of them. Furthermore, the memory demands of sequential model editing approaches tend to be prohibitive, frequently necessitating an external memory that grows incrementally over time. To cope with these challenges, we propose CoachHooK, a model editing method that simultaneously supports sequential and batch editing. CoachHooK is memory-friendly, as it needs only a small amount of memory to store several hook layers whose size remains unchanged over time. Experimental results demonstrate the superiority of our method over other batch-supportive model editing methods under both single-round and consecutive batch editing scenarios. Extensive analyses of CoachHooK verify the stability of our method over a number of consecutive steps.

Paper Structure (47 sections, 20 equations, 11 figures, 5 tables)


Figures (11)

  • Figure 1: Single-layer update with a hook layer (residual connections are omitted). $\|\cdot\|$ denotes the L2 norm computed over the keys' dimension ($m$). For each update with a single batch of edits, the temporary hook layer is used at the beginning to ensure that $\Delta$ is computed based on $W_{h}^l$. After the weight update, the validated hook layer is applied to determine, for each token, whether to use the original layer or the hook layer. This process can be applied iteratively to support consecutive batch editing. Note that the temporary hook layer weight of a new iteration is copied from the validated hook layer weight of the previous iteration, so the validated hook layer keeps track of the updates made by previous edits.
  • Figure 2: Multiple-layer update with hook layers (the attention module and the first FFN layer are omitted). The value vector $v_i$ is first computed at the last editing layer, and a fraction of the residual is then iteratively inserted into each editing layer (I, II, III) using Eq. 8. Since changing one layer affects the activations of downstream layers, the activations are recollected after each iteration. At the beginning, temporary hook layers are initialized at all editing layers. Once a hook layer's weight is updated, it is replaced by the validated hook layer (1, 2, 3).
  • Figure 3: Difference between the z-score entry corresponding to the updated key, $Z^l_{key}$, and the average of $Z^l$. The x-axis represents the sample index.
  • Figure 4: Ablation study.
  • Figure 5: Performance comparisons on initial five different values of $\lambda$.
  • ...and 6 more figures
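
The per-token routing described in the Figure 1 caption (choosing between the original layer and the hook layer for each token) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `matvec` and `route_tokens`, the use of the L2 norm of the output difference as the outlier score, the standardization across the tokens of one input, and the fixed threshold `alpha` are all assumptions made for the example. The paper describes a dynamic threshold $\alpha$, which is not modeled here.

```python
import math
from statistics import mean, pstdev


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def route_tokens(W_orig, W_hook, keys, alpha):
    """For each token, return the hook-layer or original-layer output.

    W_orig, W_hook: weights of the original and hook layers (lists of rows).
    keys:           one key vector per token.
    alpha:          z-score threshold (assumed fixed here for simplicity).
    Returns (outputs, used_hook_flags).
    """
    out_orig = [matvec(W_orig, k) for k in keys]
    out_hook = [matvec(W_hook, k) for k in keys]

    # Assumed outlier score: L2 norm of the difference between the
    # hook-layer output and the original-layer output for each token.
    norms = [
        math.sqrt(sum((h - o) ** 2 for h, o in zip(ho, oo)))
        for ho, oo in zip(out_hook, out_orig)
    ]

    # Standardize the scores across tokens. If every score is identical,
    # no token is treated as an outlier.
    mu, sigma = mean(norms), pstdev(norms)
    z = [(n - mu) / sigma if sigma > 0 else 0.0 for n in norms]

    # Tokens whose z-score exceeds alpha are treated as inside the editing
    # scope and use the hook layer; all others keep the original layer.
    flags = [zi > alpha for zi in z]
    outputs = [h if f else o for h, o, f in zip(out_hook, out_orig, flags)]
    return outputs, flags
```

Calling `route_tokens(W_orig, W_hook, keys, alpha)` on the keys of one input returns the mixed outputs together with a boolean flag per token indicating whether the hook layer was used. The design intent under these assumptions is that tokens unrelated to any edit produce nearly identical outputs from both layers, so only edited keys stand out as outliers.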