Fine-tuning Done Right in Model Editing

Wanli Yang, Fei Sun, Rui Tang, Hongyu Zang, Du Su, Qi Cao, Jingang Wang, Huawei Shen, Xueqi Cheng

TL;DR

This work challenges the long-standing view that fine-tuning is unsuitable for model editing by identifying misapplied training paradigms as the root cause. By restoring a standard breadth-first, mini-batch fine-tuning framework and adopting localized tuning (LocFT-BF), the authors demonstrate strong editing performance across diverse LLMs and tasks, achieving high reliability, robust generalization, and preserved capabilities. The study also provides a systematic analysis of tuning locations, revealing that late-layer MLP_down updates offer a reliable balance between editing success and knowledge preservation. Across lifelong editing benchmarks and large-scale models (up to 72B parameters), LocFT-BF surpasses state-of-the-art methods in both effectiveness and scalability, offering a simple, plug-and-play solution with real-world practicality.

Abstract

Fine-tuning, a foundational method for adapting large language models, has long been considered ineffective for model editing. Here, we challenge this belief, arguing that the reported failure arises not from an inherent limitation of fine-tuning itself, but from how it has been adapted to the sequential nature of the editing task: a single-pass depth-first pipeline that optimizes each sample to convergence before moving on. While intuitive, this depth-first pipeline, coupled with sample-wise updating, over-optimizes each edit and induces interference across edits. Our controlled experiments reveal that simply restoring fine-tuning to the standard breadth-first (i.e., epoch-based) pipeline with mini-batch optimization substantially improves its effectiveness for model editing. Moreover, fine-tuning in editing also suffers from suboptimal tuning parameter locations inherited from prior methods. Through systematic analysis of tuning locations, we derive LocFT-BF, a simple and effective localized editing method built on the restored fine-tuning framework. Extensive experiments across diverse LLMs and datasets demonstrate that LocFT-BF outperforms state-of-the-art methods by large margins. Notably, to our knowledge, it is the first to sustain 100K edits and 72B-parameter models, 10× beyond prior practice, without sacrificing general capabilities. By clarifying a long-standing misconception and introducing a principled localized tuning strategy, we advance fine-tuning from an underestimated baseline to a leading method for model editing, establishing a solid foundation for future research.
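
The difference between the two pipelines described in the abstract can be made concrete by looking only at the order in which samples reach the optimizer. The following is a minimal sketch, not the authors' implementation: the function names, the fixed `steps_per_sample` (standing in for "optimize to convergence"), and the absence of shuffling are all assumptions made for illustration.

```python
from typing import Iterator, List


def depth_first_schedule(num_samples: int, steps_per_sample: int) -> Iterator[List[int]]:
    """Edit-style pipeline: one pass over the data, sample-wise updates.

    Each sample is optimized alone for `steps_per_sample` consecutive steps
    (a hypothetical stand-in for training to convergence) and never revisited.
    """
    for sample in range(num_samples):
        for _ in range(steps_per_sample):
            yield [sample]


def breadth_first_schedule(num_samples: int, num_epochs: int, batch_size: int) -> Iterator[List[int]]:
    """Standard pipeline: several epochs, each taking one mini-batch step per batch.

    Every sample is revisited once per epoch, so no single edit is driven
    to convergence in isolation. Shuffling is omitted to keep the output fixed.
    """
    for _ in range(num_epochs):
        for start in range(0, num_samples, batch_size):
            yield list(range(start, min(start + batch_size, num_samples)))


if __name__ == "__main__":
    # Four edits; each index list is the set of samples used in one optimizer step.
    print(list(depth_first_schedule(num_samples=4, steps_per_sample=3)))
    print(list(breadth_first_schedule(num_samples=4, num_epochs=3, batch_size=2)))
```

In the depth-first schedule, sample 0 receives all of its updates before sample 1 is seen at all, which is the setting the abstract associates with over-optimization and interference. In the breadth-first schedule, each step mixes several edits and every edit is revisited in each epoch.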

Paper Structure

This paper contains 26 sections, 7 figures, 8 tables.

Figures (7)

  • Figure 1: Pipeline comparison of FT-M on LLaMA3-8B with 1000 ZsRE samples.
  • Figure 2: Illustration of edit pipeline (Depth-First) and standard pipeline (Breadth-First).
  • Figure 3: Visualization of the learning dynamics for depth-first and breadth-first pipelines.
  • Figure 4: Impact of batch size on editing performance under the BF pipeline on ZsRE.
  • Figure 5: Fine-tuning performance for different locations across three LLMs.
  • ...and 2 more figures