KnowledgeSmith: Uncovering Knowledge Updating in LLMs with Model Editing and Unlearning

Yinyi Luo, Zhexian Zhou, Hao Chen, Kai Qiu, Marios Savvides, Sharon Li, Jindong Wang

TL;DR

KnowledgeSmith studies how LLMs update knowledge by unifying editing and unlearning as constrained optimization, with the update operator $\theta'=\mathcal{T}(\theta; e, c)$ and distance constraints on positive and preservation probes. It introduces a KG-based automatic data-generation pipeline producing root/intermediate/leaf interventions and six probe types across four domains, enabling scalable evaluation of propagation, consistency, and robustness. A geometric SVD-based interpretation shows editing behaves like rotation plus mild scaling while unlearning resembles anisotropic scaling, aligning with observed propagation and robustness patterns. Together, the framework and benchmarks offer a practical toolset to guide reliable, domain-aware knowledge updates in LLMs.
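To make the geometric reading concrete, the following is a minimal NumPy sketch, not the paper's analysis code. It assumes a single weight matrix before and after an update and compares their SVDs; the function name `update_geometry` and the toy matrices are hypothetical. A rotation-like update leaves singular values roughly unchanged while moving the singular vectors, whereas an anisotropic scaling keeps the vectors and changes the singular values unevenly.

```python
import numpy as np

def update_geometry(w_old, w_new):
    """Compare two weight matrices through their SVDs (illustrative only).

    Returns (scale_ratios, left_alignment):
      scale_ratios   - new singular value / old singular value, per direction
      left_alignment - |cosine| between matching old and new left singular vectors
    Assumes singular values are distinct and keep their ordering.
    """
    u_old, s_old, _ = np.linalg.svd(w_old, full_matrices=False)
    u_new, s_new, _ = np.linalg.svd(w_new, full_matrices=False)
    scale_ratios = s_new / s_old
    # Column-wise dot products; abs() removes the SVD sign ambiguity.
    left_alignment = np.abs(np.sum(u_old * u_new, axis=0))
    return scale_ratios, left_alignment

# Toy weight matrix with known, distinct singular values (hypothetical data).
rng = np.random.default_rng(0)
u, _ = np.linalg.qr(rng.standard_normal((4, 4)))
v, _ = np.linalg.qr(rng.standard_normal((4, 4)))
s = np.array([4.0, 3.0, 2.0, 1.0])
w = u @ np.diag(s) @ v.T

# "Rotation-like" update: left-multiply by a planar rotation.
theta = np.pi / 3
rot = np.eye(4)
rot[:2, :2] = [[np.cos(theta), -np.sin(theta)],
               [np.sin(theta), np.cos(theta)]]
w_rotated = rot @ w

# "Anisotropic scaling" update: shrink each direction by a different gain.
gains = np.array([1.0, 0.9, 0.5, 0.2])
w_scaled = u @ np.diag(s * gains) @ v.T

rot_ratios, rot_align = update_geometry(w, w_rotated)
scl_ratios, scl_align = update_geometry(w, w_scaled)
```

In this toy setup the rotated matrix keeps its scale ratios near one while at least one alignment drops, and the scaled matrix shows the opposite pattern. Real model updates would mix both effects, so these two quantities are only a rough diagnostic.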

Abstract

Knowledge editing and machine unlearning are two popular approaches for keeping large language models (LLMs) up-to-date. However, the knowledge updating mechanism of LLMs remains largely unexplored due to insufficient, isolated, and small-scale evaluation. For instance, are LLMs similar to humans in modifying certain knowledge? How do editing and unlearning differ as training data increases? This paper proposes KnowledgeSmith, a unified framework to systematically understand the updating mechanism of LLMs. We first cast editing and unlearning as instances of one constrained optimization problem. Then, we propose an automatic dataset generator that provides structured interventions across multiple graph levels and data scales, enabling controlled studies of how different modification strategies propagate through model knowledge. Extensive experiments yield nuanced insights into knowledge propagation, plasticity scaling, consistency, and robustness. For instance, our results show that LLMs do not update knowledge at different levels the way humans do, and that there exists a consistency-capacity trade-off. We hope our findings can inform the design of more reliable and scalable strategies. Code: https://github.com/AIFrontierLab/KnowledgeSmith.git
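As a rough illustration of what "graph levels" can mean for the dataset generator, the sketch below labels nodes of a small directed hierarchy as root, intermediate, or leaf. This is an assumption for exposition, not the KnowledgeSmith pipeline: the function `node_levels`, the edge format, and the toy taxonomy are all hypothetical, and the paper may define the levels differently.

```python
from collections import defaultdict

def node_levels(edges):
    """Label each node as 'root', 'intermediate', or 'leaf'.

    edges: iterable of (parent, child) pairs from a directed hierarchy.
    A node with no parent is a root, a node with no child is a leaf,
    and everything else is intermediate.
    """
    parents = defaultdict(set)
    children = defaultdict(set)
    nodes = set()
    for parent, child in edges:
        children[parent].add(child)
        parents[child].add(parent)
        nodes.update((parent, child))

    levels = {}
    for node in nodes:
        if not parents[node]:
            levels[node] = "root"
        elif not children[node]:
            levels[node] = "leaf"
        else:
            levels[node] = "intermediate"
    return levels

# Hypothetical toy taxonomy, not taken from the paper's data.
toy_edges = [("animal", "mammal"), ("mammal", "dog"), ("mammal", "cat")]
toy_levels = node_levels(toy_edges)
```

An intervention targeting "animal" would then count as root-level, one targeting "mammal" as intermediate, and one targeting "dog" or "cat" as leaf-level, which is the kind of split a propagation study would compare.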

Paper Structure

This paper contains 40 sections, 4 equations, 8 figures, 15 tables.

Figures (8)

  • Figure 1: KnowledgeSmith pipeline. Starting from a static KG, we generate dynamic probes at root, intermediate, and leaf levels, enabling evaluation of direct and propagated effects.
  • Figure 2: Propagation asymmetry metrics.
  • Figure 3: Plasticity scaling of the LLaMA3 family under (a) editing and (b) unlearning. (c) Propagation limits across three branches. (d) Consistency-capacity trade-off.
  • Figure 4: Robustness evaluation under multiple stress tests. (a) Out-of-distribution (OOD) vs. in-domain accuracy. (b) Adversarial robustness relative to original accuracy. (c) Instruction-following accuracy in free generation, judged by an LLM. (d) Hallucination tendency across interventions.
  • Figure 5: LoRA, editing, and unlearning.
  • ...and 3 more figures