UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Language Models

Xiaojie Gu, Ziying Huang, Jia-Chen Gu, Kai Zhang

TL;DR

This work proposes UltraEdit, a training-, subject-, and memory-free approach well suited to ultra-scalable, real-world lifelong model editing. The authors report that it is currently the only method capable of editing a 7B LLM on a 24GB consumer-grade GPU.

Abstract

Lifelong learning enables large language models (LLMs) to adapt to evolving information by continually updating their internal knowledge. An ideal system should support efficient, wide-ranging updates while preserving existing capabilities and ensuring reliable deployment. Model editing stands out as a promising solution for this goal, offering a focused and efficient way to revise a model's internal knowledge. Although recent paradigms have made notable progress, they often struggle to meet the demands of practical lifelong adaptation at scale. To bridge this gap, we propose UltraEdit, a training-, subject-, and memory-free approach that is well-suited for ultra-scalable, real-world lifelong model editing. UltraEdit fundamentally differs from traditional paradigms by computing parameter shifts in one step using only a hidden state and its gradient, making the approach simple yet efficient. To improve scalability in lifelong settings, UltraEdit employs a lifelong normalization strategy that continuously updates feature statistics across turns, allowing it to adapt to distributional shifts and maintain consistency over time. UltraEdit achieves editing speeds more than $7\times$ faster than the previous state-of-the-art method, while requiring $4\times$ less VRAM. This makes it the only method currently capable of editing a 7B LLM on a 24GB consumer-grade GPU. Furthermore, we construct UltraEditBench, the largest dataset in the field to date with over 2M editing pairs, and demonstrate that our method supports up to 2M edits while maintaining high accuracy. Comprehensive experiments on five datasets and six models show that UltraEdit consistently achieves superior performance across diverse model editing scenarios, taking a further step towards safe and scalable lifelong learning. Our code is available at https://github.com/XiaojieGu/UltraEdit.
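
To make the phrase "computing parameter shifts in one step using only a hidden state and its gradient" concrete, here is a minimal sketch of a closed-form, rank-one weight update. It is an illustration under stated assumptions, not the paper's exact formulation: the function names, the `lr` and `ridge` parameters, and the choice of `-lr * grad` as the target output shift are all hypothetical. The sketch assumes `grad` is the gradient of an editing loss with respect to a linear layer's output at the input `hidden`.

```python
def one_step_shift(hidden, grad, lr=1.0, ridge=0.0):
    """Closed-form shift delta_W such that delta_W @ hidden ~= -lr * grad.

    Illustrative assumption: the desired change in the layer output is a
    single gradient step, -lr * grad. The minimum-norm solution is the
    rank-one matrix (-lr * grad) hidden^T / (hidden . hidden + ridge).
    Requires a nonzero `hidden` or ridge > 0 to avoid division by zero.
    """
    scale = -lr / (sum(h * h for h in hidden) + ridge)
    # One row per output dimension, one column per input dimension.
    return [[scale * g * h for h in hidden] for g in grad]


def apply_shift(weight, delta):
    """Return weight + delta without modifying either argument."""
    return [[w + d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


def matvec(matrix, vec):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


hidden = [1.0, 2.0]   # hypothetical hidden state feeding the layer
grad = [0.5, -1.0]    # hypothetical gradient at the layer output
delta = one_step_shift(hidden, grad)
shifted_output = matvec(delta, hidden)  # change in output caused by delta
```

With `ridge=0.0` the output at `hidden` moves by exactly `-lr * grad`; a positive `ridge` shrinks the update, which is one common way to keep a closed-form solve numerically stable. No iterative training loop is involved, which is the property the abstract emphasizes.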

Paper Structure

This paper contains 35 sections, 14 equations, 9 figures, 17 tables, 1 algorithm.

Figures (9)

  • Figure 1: (a) Average Efficacy and editing time of different solutions on 20K edits from ZsRE, evaluated across GPT-J, Mistral, and LLaMA-3. (b) Variation in average Efficacy as edits accumulate. Dashed lines represent performance on the ZsRE dataset across GPT-J, Mistral, and LLaMA-3, while solid lines represent results on the WikiBigEdit dataset with LLaMA-3.
  • Figure 2: VRAM usage over the course of 20K edits on the ZsRE dataset using different methods with Mistral-7B.
  • Figure 3: This figure illustrates the lifelong editing workflow of UltraEdit, where parameter shifts are applied iteratively across turns using a lifelong normalization mechanism that maintains running statistics of editing-instance features to ensure stable and consistent model behavior over time.
  • Figure 4: Variation in average generalization and specificity as edits accumulate.
  • Figure 5: Efficacy of lifelong editing on Phi-4-14B and Gemma-3-27B.
  • ...and 4 more figures
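
The Figure 3 caption describes a normalization mechanism that "maintains running statistics of editing-instance features" across turns. A minimal sketch of that idea, assuming per-dimension mean and variance tracked with Welford's online algorithm, might look as follows. The class and method names are hypothetical, and the choice of Welford's update is an assumption; the source does not specify how the statistics are accumulated.

```python
import math


class RunningNormalizer:
    """Tracks per-dimension mean and variance across editing turns."""

    def __init__(self, dim):
        self.count = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations

    def update(self, batch):
        """Fold one turn's feature vectors into the running statistics."""
        for x in batch:
            self.count += 1
            for i, xi in enumerate(x):
                delta = xi - self.mean[i]
                self.mean[i] += delta / self.count
                self.m2[i] += delta * (xi - self.mean[i])

    def normalize(self, x, eps=1e-8):
        """Standardize one vector using the statistics seen so far."""
        if self.count < 2:
            # Variance is undefined with fewer than two samples: center only.
            return [xi - m for xi, m in zip(x, self.mean)]
        return [
            (xi - m) / (math.sqrt(s / (self.count - 1)) + eps)
            for xi, m, s in zip(x, self.mean, self.m2)
        ]


norm = RunningNormalizer(dim=2)
norm.update([[1.0, 2.0], [3.0, 6.0]])  # features from turn 1
norm.update([[5.0, 10.0]])             # features from turn 2
scaled = norm.normalize([5.0, 6.0])
```

Because only the count, mean, and sum of squared deviations are stored, memory use stays constant no matter how many turns have been processed, and the statistics drift with the feature distribution instead of being fixed after the first batch.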