Rebuilding ROME : Resolving Model Collapse during Sequential Model Editing

Akshat Gupta; Sidharth Baskaran; Gopala Anumanchipalli

TL;DR

This work identifies disabling edits in sequential model editing with Rank-One Model Editing (ROME) as artifacts of asymmetric key-vector usage, leading to abrupt model collapse. It introduces r-ROME, a stable reimplementation that enforces homogeneous key-vectors throughout the update, which substantially reduces update norms $|\Delta|$, preserves downstream performance, and enables large-scale sequential editing. The study also analyzes the root cause with a mathematical account of the update equations and demonstrates that r-ROME outperforms the original ROME and the variant p-ROME in generalization and stability across GPT-J-6B and GPT2-XL on CounterFact benchmarks. Overall, the proposed r-ROME makes sequential editing more reliable and scalable, with a public code release to facilitate adoption, while acknowledging that scale-related degradation remains a current limitation.

Abstract

Recent work using Rank-One Model Editing (ROME), a popular model editing method, has shown that there are certain facts that the algorithm is unable to edit without breaking the model. Such edits have previously been called disabling edits. These disabling edits cause immediate model collapse and limit the use of ROME for sequential editing. In this paper, we show that disabling edits are an artifact of irregularities in the implementation of ROME. We provide a more stable implementation of ROME, which we call r-ROME, and show that model collapse is no longer observed when making large-scale sequential edits with r-ROME, while generalization and locality of model editing further improve compared to the original implementation of ROME. We also provide a detailed mathematical explanation of the reason behind disabling edits.
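
To make the idea of homogeneous key-vectors more concrete, the following is a minimal NumPy sketch of a generic rank-one weight update, not the authors' implementation. It assumes the standard closed form $\Delta = (v - Wk)(C^{-1}k)^\top / (k^\top C^{-1} k)$. The `k_den` argument is a hypothetical knob added purely for illustration: it lets the denominator use a different key than the rest of the formula, which is one possible way an update could become asymmetric. The paper's exact update equations may differ. All matrices, dimensions, and names below are made up for the example.

```python
import numpy as np

def rank_one_update(W, C_inv, k, v, k_den=None):
    """Rank-one update Delta; (W + Delta) @ k equals v when k_den is k.

    k_den is a hypothetical parameter for illustration only: it lets the
    denominator use a different key than the residual and the direction.
    """
    if k_den is None:
        k_den = k                      # homogeneous: one key everywhere
    residual = v - W @ k               # what the edit still has to add
    direction = C_inv @ k              # row direction of the rank-one matrix
    return np.outer(residual, direction) / (k_den @ direction)

# Toy data (assumed sizes, random values; no relation to a real model).
rng = np.random.default_rng(0)
d_in, d_out, n = 8, 6, 200
W = rng.normal(size=(d_out, d_in))
K = rng.normal(size=(d_in, n))         # stand-in for previously seen keys
C_inv = np.linalg.inv(K @ K.T / n)     # inverse of the key covariance
k = rng.normal(size=d_in)
v = rng.normal(size=d_out)
k_other = k + 0.5 * rng.normal(size=d_in)   # a second, slightly different key

delta_same = rank_one_update(W, C_inv, k, v)
delta_mixed = rank_one_update(W, C_inv, k, v, k_den=k_other)

print("edit satisfied (same key): ", np.allclose((W + delta_same) @ k, v))
print("edit satisfied (mixed keys):", np.allclose((W + delta_mixed) @ k, v))
print("|Delta| same vs. mixed:",
      np.linalg.norm(delta_same), np.linalg.norm(delta_mixed))
```

With a single key throughout, the edited weights map `k` exactly to `v`. When the denominator uses a different key, the update is rescaled by the ratio of the two denominators, so the edit is no longer exact and the norm of the update can grow when the mixed denominator is small.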

Paper Structure (11 sections, 5 equations, 9 figures, 4 tables)

Introduction
Background
Evaluating Model Editing
Experiments
Properties of Disabling Edits
Fixing ROME
Sequential Editing with r-ROME
Conclusion
Limitations
Related Work
Additional Sequential Editing Experiments

Figures (9)

Figure 1: A typical generation example after a disabling edit, compared to a normal model edit using ROME. The bold and underlined part of the text is the input prompt.
Figure 2: The difference between the ROME and r-ROME updates on GPT-J (6B) for 5k individual edits. Our implementation shows far fewer potential disabling edits, as indicated by lower $|\Delta|$ values.
Figure 3: Sequential editing using the original implementation of ROME on GPT-J (6B).
Figure 4: Sequential editing with r-ROME on GPT-J (6B).
Figure 5: The distribution of edits along the $|\Delta|$ and Normalized Entropy metrics for edits using the original ROME implementation on the CounterFact dataset for GPT2-XL.
...and 4 more figures
