Model Editing by Standard Fine-Tuning

Govind Gangadhar; Karl Stratos

TL;DR

This work investigates whether standard fine-tuning can serve as an effective model editor. It introduces two minimal changes: optimizing the conditional likelihood $p_\theta(o \mid c\pi)$, and augmenting training with $\mathcal{P}$ (pseudo-paraphrases) and $\mathcal{R}$ (random facts). The approach is evaluated on ZsRE and CounterFact with 10k edits each. The results show that these simple modifications match or exceed specialized editors on the edit-score metric, even when using parameter-efficient fine-tuning. The study suggests that model editing may be achievable through standard training with modest data augmentation, though preserving broader capabilities and generative quality remains challenging.
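The augmentation described above can be illustrated with a minimal sketch. It assumes each edit is a (prompt, target) pair, that paraphrases are already available per edit prompt, and that a fixed number of unedited facts is sampled per edit; the function name `build_training_set`, its parameters, and the per-edit sampling scheme are hypothetical and not taken from the paper.

```python
import random


def build_training_set(edits, paraphrases, unedited_facts, n_random=1, seed=0):
    """Combine edits with paraphrases and unedited facts (illustrative only).

    edits: list of (prompt, new_target) pairs to write into the model.
    paraphrases: dict mapping an edit prompt to a list of rephrased prompts.
    unedited_facts: list of (prompt, original_target) pairs that must not change.
    n_random: assumed number of unedited facts sampled per edit.
    """
    rng = random.Random(seed)
    examples = []
    for prompt, target in edits:
        # The edit itself.
        examples.append((prompt, target))
        # Paraphrased prompts carry the new target, to encourage generalization.
        for rephrased in paraphrases.get(prompt, []):
            examples.append((rephrased, target))
        # Unedited facts keep their original target, to encourage locality.
        k = min(n_random, len(unedited_facts))
        examples.extend(rng.sample(unedited_facts, k))
    return examples
```

The resulting list can be fed to any ordinary fine-tuning loop, which is the point of the approach: the editing behavior comes from the data, not from a specialized update rule.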

Abstract

Standard fine-tuning is considered less effective than specialized methods for model editing because of its comparatively poor performance. However, it is simple, agnostic to the architectural details of the model being edited, and able to leverage advances in standard training techniques with no additional work (e.g., black-box PEFT for computational efficiency), making it an appealing choice for a model editor. In this work, we show that standard fine-tuning alone can yield competitive model editing performance with two minor modifications. First, we optimize the conditional likelihood rather than the full likelihood. Second, in addition to the typical practice of training on randomly paraphrased edit prompts to encourage generalization, we also train on random or similar unedited facts to encourage locality. Our experiments on the ZsRE and CounterFact datasets demonstrate that these simple modifications allow standard fine-tuning to match or outperform highly specialized editors in terms of edit score.
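The difference between the conditional and the full likelihood can be sketched as a choice of which token positions contribute to the loss. The sketch below is an illustration under assumptions, not the authors' implementation: `make_labels` and `masked_nll` are hypothetical helpers, per-token log-probabilities are passed in directly instead of being computed by a model, and the value -100 is borrowed from the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
IGNORE_INDEX = -100  # matches the default ignore_index of torch.nn.CrossEntropyLoss


def make_labels(prompt_ids, target_ids, conditional=True):
    """Return (input_ids, labels) for one edit example.

    With conditional=True only the target tokens are scored, so the loss is
    the negative log-likelihood of the target given the prompt. With
    conditional=False the prompt tokens are scored as well (full likelihood).
    """
    input_ids = list(prompt_ids) + list(target_ids)
    if conditional:
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(target_ids)
    else:
        labels = list(input_ids)
    return input_ids, labels


def masked_nll(token_logprobs, labels):
    """Mean negative log-likelihood over positions that are not ignored.

    token_logprobs[i] is assumed to be the model's log-probability of the
    token at position i given all preceding tokens.
    """
    kept = [lp for lp, y in zip(token_logprobs, labels) if y != IGNORE_INDEX]
    return -sum(kept) / len(kept)


# Usage with made-up token ids and log-probabilities.
prompt_ids, target_ids = [5, 6, 7], [8, 9]
logprobs = [-2.0, -1.0, -3.0, -0.5, -1.5]
_, cond_labels = make_labels(prompt_ids, target_ids, conditional=True)
_, full_labels = make_labels(prompt_ids, target_ids, conditional=False)
conditional_loss = masked_nll(logprobs, cond_labels)
full_loss = masked_nll(logprobs, full_labels)
```

Masking the prompt positions means gradient updates are spent only on producing the new target, not on re-learning to generate the prompt text itself.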

Paper Structure (17 sections, 2 equations, 2 figures, 9 tables)

Introduction
Task
Method
Contrastive learning.
Related Work
Experiments
Mass-Editing
Single-Editing
Generative Metrics
Analysis
What is the effect of PEFT?
Does layer selection help?
Subject knowledge vs data augmentation.
Conclusions
Mass-Editing on WikiRecent
...and 2 more sections

Figures (2)

Figure 1: An example from CounterFact along with paraphrase and random fact augmentation.
Figure 2: The few-shot prompt we use for paraphrase generation. We selected in-context examples from CounterFact.
