Resolving UnderEdit & OverEdit with Iterative & Neighbor-Assisted Model Editing
Bhiman Kumar Baghel, Emma Jordan, Zheyuan Ryan Shi, Xiang Lorraine Li
TL;DR
This work tackles the challenge of keeping LLM knowledge up to date by addressing two failure modes of model editing: UnderEdit (failure to inject an update) and OverEdit (unintended changes to neighboring knowledge). It introduces two complementary strategies, Iterative Model Editing and Neighbor-Assisted Model Editing, that operate within the standard two-stage locate-and-edit framework: Optimization, which learns ideal hidden states in the causal layers, followed by Spread, which updates the final MLP weights. Empirical results across multiple editors (MEMIT, PMET, AlphaEdit, ROME), model families (GPT-2 XL, GPT-J, Llama-2, Llama-3.1), and factual benchmarks (COUNTERFACT, ZsRE) show reductions in UnderEdit of up to 38 percentage points and in OverEdit of up to 6 percentage points, with iterative editing improving edit success and neighbor-assisted editing improving specificity. The methods apply to existing and future locate-and-edit approaches, and the code is released for reproducibility and for integration into practical knowledge-update pipelines.
Abstract
Large Language Models (LLMs) are widely deployed in downstream tasks, but keeping their knowledge up to date via retraining or fine-tuning is often computationally expensive. Model editing, which often follows the locate-and-edit paradigm, provides a more efficient alternative by updating a targeted subset of parameters. Despite this efficiency, existing methods are limited: edits may fail to inject knowledge (UnderEdit) or unintentionally disrupt unrelated neighboring knowledge (OverEdit). To address these challenges, we propose two complementary methods: iterative model editing, which applies successive edits to mitigate UnderEdit, and neighbor-assisted model editing, which incorporates neighboring knowledge during editing to reduce OverEdit. Our extensive experiments show that these techniques improve editing performance across multiple LLMs, algorithms, and benchmarks, reducing UnderEdit by up to 38 percentage points and OverEdit by up to 6 percentage points, while remaining broadly applicable to any locate-and-edit method. We release our code at https://github.com/bhimanbaghel/ResolveUnderOverEdit.
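
To make the idea of applying successive edits concrete, the following is a minimal toy sketch and not the released implementation. It assumes a single linear layer `W`, one key vector `k`, a target output `v_target`, and a regularized least-squares update of the kind used by some locate-and-edit methods. The function names, the regularization weight `lam`, and the residual-based stopping rule are all hypothetical choices made for illustration. Because of the regularizer, one update only moves the layer output part of the way to the target, which is a simple analogue of UnderEdit; repeating the update shrinks the remaining gap.

```python
import numpy as np


def regularized_edit(W, k, v_target, C, lam=10.0):
    """One toy edit step (illustrative assumption, not the paper's code).

    Applies delta = r k^T (lam * C + k k^T)^{-1}, where r is the gap between
    the target and the current output for key k. The lam * C term keeps the
    update small, so the gap is only partially closed.
    """
    r = v_target - W @ k                    # what the edit still has to fix
    A = lam * C + np.outer(k, k)            # symmetric positive definite here
    delta = np.outer(r, np.linalg.solve(A, k))
    return W + delta


def iterative_edit(W, k, v_target, C, lam=10.0, tol=1e-3, max_iters=200):
    """Repeat the toy edit until the residual norm drops below tol.

    Returns the edited weights and the residual norm before each step,
    plus the final one. The stopping rule is a hypothetical stand-in.
    """
    history = [np.linalg.norm(v_target - W @ k)]
    while history[-1] > tol and len(history) <= max_iters:
        W = regularized_edit(W, k, v_target, C, lam)
        history.append(np.linalg.norm(v_target - W @ k))
    return W, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_in, d_out = 8, 4
    W0 = rng.normal(size=(d_out, d_in))     # toy layer weights
    k = rng.normal(size=d_in)               # key for the fact being edited
    v_target = rng.normal(size=d_out)       # desired output for that key
    X = rng.normal(size=(d_in, 200))
    C = X @ X.T / 200                       # toy covariance of other keys

    W_once = regularized_edit(W0, k, v_target, C)
    W_iter, history = iterative_edit(W0, k, v_target, C)
    print("residual after one edit:", np.linalg.norm(v_target - W_once @ k))
    print("residual after", len(history) - 1, "edits:", history[-1])
```

In this toy setting each step shrinks the residual by a constant factor, so the loop converges geometrically. A real editor would re-run its own optimization and spread stages on the model at each iteration and would need its own stopping criterion.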
