Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors

Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi

TL;DR

Aging with GRACE introduces lifelong model editing by attaching discrete, codebook-based adaptors to pre-trained transformers, allowing targeted, input-driven edits without altering weights. Through a deferral-based retrieval mechanism and progressive codebook maintenance, GRACE achieves thousands of sequential edits while preserving training-data behavior and prior corrections. Empirical results across T5, BERT, and GPT2-XL show GRACE outperforms baselines on edit success and retention metrics with compact codebooks and modest inference-time overhead. The approach offers a plug-and-play, parameter-efficient solution for mitigating deployment-time errors such as hallucinations or label shifts, with interpretability via detachable codebooks. Overall, GRACE demonstrates scalable, generalizable, and efficient lifelong editing for large-scale language models.

Abstract

Deployed language models decay over time due to shifting inputs, changing user needs, or emergent world-knowledge gaps. When such problems are identified, we want to make targeted edits while avoiding expensive retraining. However, current model editors, which modify such behaviors of pre-trained models, degrade model performance quickly across multiple, sequential edits. We propose GRACE, a lifelong model editing method, which implements spot-fixes on streaming errors of a deployed model, ensuring minimal impact on unrelated inputs. GRACE writes new mappings into a pre-trained model's latent space, creating a discrete, local codebook of edits without altering model weights. This is the first method enabling thousands of sequential edits using only streaming errors. Our experiments on T5, BERT, and GPT models show GRACE's state-of-the-art performance in making and retaining edits, while generalizing to unseen inputs. Our code is available at https://www.github.com/thartvigsen/grace.
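
The codebook idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name `CodebookAdaptor`, the `add_edit` method, the Euclidean distance, and the fixed radius `eps_init` are assumptions made for illustration, and the replacement values are passed in directly rather than learned. It only shows the general pattern of caching a key for an edited input, retrieving the stored value when a new activation falls within the key's deferral radius, and otherwise deferring to the frozen layer.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


@dataclass
class CodebookAdaptor:
    """Hypothetical key-value adaptor wrapped around one frozen layer."""

    layer: Callable[[Vector], List[float]]  # frozen layer; never modified
    eps_init: float = 1.0                   # assumed initial deferral radius
    keys: List[List[float]] = field(default_factory=list)
    values: List[List[float]] = field(default_factory=list)
    radii: List[float] = field(default_factory=list)

    def add_edit(self, h: Vector, value: Vector) -> None:
        """Cache activation h as a key mapped to a replacement value.

        Assumption: the value is supplied by the caller; this sketch
        does not train it.
        """
        self.keys.append(list(h))
        self.values.append(list(value))
        self.radii.append(self.eps_init)

    def __call__(self, h: Vector) -> List[float]:
        """Return a stored value if h is near a key, else defer to the layer."""
        if self.keys:
            dists = [euclidean(h, k) for k in self.keys]
            nearest = min(range(len(dists)), key=dists.__getitem__)
            if dists[nearest] <= self.radii[nearest]:
                return list(self.values[nearest])
        return self.layer(h)


def frozen_layer(h: Vector) -> List[float]:
    """Stand-in for a pre-trained layer: doubles every coordinate."""
    return [2.0 * x for x in h]


adaptor = CodebookAdaptor(layer=frozen_layer, eps_init=0.5)
adaptor.add_edit([1.0, 1.0], [9.0, 9.0])

near = adaptor([1.1, 1.0])  # inside the radius: the stored value is used
far = adaptor([3.0, 3.0])   # outside every radius: the frozen layer is used
```

In this toy setup, inputs close to the edited activation receive the cached value while all other inputs pass through the unchanged layer, which mirrors the claim that edits are local and leave the model weights untouched.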

Paper Structure (57 sections, 1 equation, 18 figures, 4 tables, 1 algorithm)

Figures (18)

  • Figure 1: Overview of lifelong model editing with GRACE. a) Models make important errors that must be corrected. b) GRACE makes edits by learning, caching, and selectively retrieving new transformations between layers. c) Edits appear sporadically and require quick fixes, so GRACE codebooks are curated over long sequences of edits.
  • Figure 2: Illustrative example of GRACE. We train a model on separable data in (a), then introduce locally-flipped labels at test time in (b). In (c), the original model unsurprisingly misclassifies these label-flipped instances. In (d), GRACE fixes these labels without impacting other inputs.
  • Figure 3: ES and TRR while editing GPT2-XL on Hallucination. Lower values are better because TRR and ERR measure perplexity. GRACE outperforms the comparisons by making successful edits while maintaining the model's training knowledge. All metrics are shown in Figure \ref{fig:hallucination_bars_over_time_all}.
  • Figure 4: Impact of $\epsilon_\text{init}$ and block choice for GRACE editing T5 on zsRE for 3000 sequential edits. Other $\epsilon_\text{init}$ values are in Appendix \ref{app:hyperparams}. Along with TRR and ERR, we also measure F1 on a "Holdout" edit set containing unseen rephrasings of all edits. We find that blocks 0 and 6 use more keys and achieve higher TRR, but can lead to lower ERR and generalize worse, as indicated by their lower Holdout values.
  • Figure 5: Interpreting GRACE keys throughout editing. Larger values of $\epsilon_\text{init}$ achieve good generalization. The grey line shows the true number of holdouts per edit.
  • ...and 13 more figures