Knowledge Editing with Subspace-Aware Key-Value Mappings

Haewon Park, Sangwoo Kim, Yohan Jo

TL;DR

This work addresses the challenge of editing factual knowledge in large language models while preserving unrelated information. It introduces Subspace Knowledge Edit (SUIT), a subspace-aware framework that constrains edits to an entity-specific feature subspace for keys and a two-dimensional residual subspace for updates, leveraging the Linear Representation Hypothesis. By decomposing the key and residual updates with SVD-guided projections and a compact residual subspace, SUIT achieves high edit efficacy with substantially improved specificity and minimal perturbation to general capabilities, across multiple models and datasets. Comprehensive analyses, including last-token perturbation reduction and subspace diagnostics, support the subspace localization claims and demonstrate scalability to larger models. Overall, SUIT provides a principled, scalable approach to precise knowledge editing with strong practical impact for maintaining model fidelity during updates.

Abstract

Knowledge editing aims to efficiently correct factual errors in Language Models (LMs). The popular locate-then-edit approach modifies an MLP layer by finding an optimal mapping between its input vector (key) and output vector (value) that leads to the expression of the edited knowledge. However, existing methods place no constraints on the key and value vectors and therefore cause significant perturbations to the edited model. To address this, we propose Subspace Knowledge Edit (SUIT), a method that identifies and modifies only the subspace of critical features relevant to the edit. Our empirical results on LLaMA-3-8B, GPT-J-6B, and Qwen2.5-7B models show that SUIT dramatically improves knowledge preservation over strong baselines while maintaining high edit efficacy. This effectiveness confirms that SUIT successfully identifies the critical subspace for the edit. Further analyses provide additional validation for our approach. The source code and data will be released to the public upon publication of the paper.
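
The key-value view described in the abstract can be illustrated with a toy example. The sketch below is not the SUIT algorithm, whose subspace identification is not specified in this summary. It assumes a single linear layer `W`, a generic least-norm rank-one update, and a hypothetical two-dimensional "critical" subspace taken from the SVD of random sample keys. All function and variable names are illustrative. It shows only the general idea that restricting the key direction of an update to a subspace leaves keys orthogonal to that subspace unperturbed.

```python
import numpy as np

def rank_one_edit(W, k, v_target):
    """Unconstrained rank-one update so that W_new @ k == v_target."""
    residual = v_target - W @ k
    return W + np.outer(residual, k) / (k @ k)

def subspace_rank_one_edit(W, k, v_target, basis):
    """Rank-one update whose key direction lies in span(basis).

    `basis` is assumed to have orthonormal columns. Keys orthogonal to
    span(basis) are mapped exactly as before the edit.
    """
    k_sub = basis @ (basis.T @ k)        # projection of the key onto the subspace
    residual = v_target - W @ k
    return W + np.outer(residual, k_sub) / (k_sub @ k)

rng = np.random.default_rng(0)
d_in, d_out = 8, 6
W = rng.normal(size=(d_out, d_in))       # toy stand-in for an MLP projection
k = rng.normal(size=d_in)                # key of the fact to edit
v_target = rng.normal(size=d_out)        # value expressing the edited fact

# Hypothetical subspace: top-2 right singular vectors of random sample keys.
samples = rng.normal(size=(20, d_in))
_, _, Vt = np.linalg.svd(samples, full_matrices=False)
basis = Vt[:2].T                         # shape (d_in, 2), orthonormal columns

W_full = rank_one_edit(W, k, v_target)
W_sub = subspace_rank_one_edit(W, k, v_target, basis)

# An unrelated key with no component inside the subspace.
r = rng.normal(size=d_in)
q = r - basis @ (basis.T @ r)

print("full edit perturbation on q:    ", np.linalg.norm(W_full @ q - W @ q))
print("subspace edit perturbation on q:", np.linalg.norm(W_sub @ q - W @ q))
```

Both updates map the edited key to the target value. In this toy setting the subspace-constrained update leaves `W @ q` unchanged for any `q` orthogonal to the chosen subspace, while the unconstrained update generally shifts it.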

Paper Structure

This paper contains 80 sections, 38 equations, 13 figures, and 17 tables.

Figures (13)

  • Figure 1: Mean F1 score degradation during a 5,000-edit setting.
  • Figure 2: Comparison of token-level perturbations in residual streams.
  • Figure 3: Norm differences of MLP outputs at the subject entity’s last token position across methods.
  • Figure 4: Effects of $\Delta \mathbf{w}_1$ and $\Delta \mathbf{w}_2$ on the logits of "Apple" and "Google".
  • Figure 5: Tradeoff analysis for hyperparameters $\tau_{\text{energy}}$ and $\lambda$.
  • ...and 8 more figures