MetaKE: Meta-learning Aligned Knowledge Editing via Bi-level Optimization

Shuxin Liu, Ou Wu

Abstract

Knowledge editing (KE) aims to precisely rectify specific knowledge in Large Language Models (LLMs) without disrupting general capabilities. State-of-the-art methods suffer from an open-loop control mismatch. We identify a critical "Semantic-Execution Disconnect": the semantic target is derived independently without feedback from the downstream's feasible region. This misalignment often causes valid semantic targets to fall within the prohibited space, resulting in gradient truncation and editing failure. To bridge this gap, we propose MetaKE (Meta-learning Aligned Knowledge Editing), a new framework that reframes KE as a bi-level optimization problem. Departing from static calculation, MetaKE treats the edit target as a learnable meta-parameter: the upper-level optimizer seeks a feasible target to maximize post-edit performance, while the lower-level solver executes the editing. To address the challenge of differentiating through complex solvers, we derive a Structural Gradient Proxy, which explicitly backpropagates editability constraints to the target learning phase. Theoretical analysis demonstrates that MetaKE automatically aligns the edit direction with the model's feasible manifold. Extensive experiments confirm that MetaKE significantly outperforms strong baselines, offering a new perspective on knowledge editing.

Paper Structure

This paper contains 55 sections, 8 theorems, 71 equations, 2 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Assume $\boldsymbol{C}+\lambda_{\mathrm{ridge}}\boldsymbol{I}\succ \boldsymbol{0}$. The realized residual is a scaled version of the target: $\boldsymbol{\delta}_{\mathrm{real}} = \beta\,\boldsymbol{\delta}$, where $\beta = \frac{\gamma}{1+\gamma}$ and $\gamma = \sum_{j=1}^{d} \frac{(\boldsymbol{u}_
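
To make the scaling in Theorem 1 concrete, the following minimal Python sketch evaluates $\beta = \frac{\gamma}{1+\gamma}$ for a few values of $\gamma$ and applies it to a toy target residual. The definition of $\gamma$ is cut off in this excerpt, so the sketch simply treats $\gamma$ as a given nonnegative scalar. The function names `shrinkage_factor` and `realized_residual` and all numeric values are illustrative assumptions, not taken from the paper.

```python
from typing import List


def shrinkage_factor(gamma: float) -> float:
    """Return beta = gamma / (1 + gamma), as stated in Theorem 1.

    Assumption: gamma is a nonnegative scalar. Its full definition is
    truncated in the excerpt, so it is passed in rather than computed.
    """
    if gamma < 0:
        raise ValueError("this sketch assumes gamma >= 0")
    return gamma / (1.0 + gamma)


def realized_residual(delta: List[float], gamma: float) -> List[float]:
    """Scale the target residual delta by beta (delta_real = beta * delta)."""
    beta = shrinkage_factor(gamma)
    return [beta * d for d in delta]


if __name__ == "__main__":
    # Hypothetical target residual, chosen only for illustration.
    delta = [2.0, -4.0]
    for gamma in (0.1, 1.0, 9.0):
        beta = shrinkage_factor(gamma)
        print(f"gamma={gamma}: beta={beta:.3f}, "
              f"delta_real={realized_residual(delta, gamma)}")
```

Because $\beta < 1$ for every finite $\gamma \ge 0$, the realized residual always undershoots the target, and the shortfall grows as $\gamma$ shrinks toward zero.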

Figures (2)

  • Figure 1: The Semantic–Execution Disconnect: (A) Spectral Suppression attenuates the update signal; (B) The Static Regularization Trap shows the mismatch between isotropic trust regions and anisotropic feasibility.
  • Figure 2: The architecture of the proposed method, MetaKE.

Theorems & Definitions (13)

  • Theorem 1: Spectral Suppression
  • Theorem 2: Static Regularization Trap
  • Theorem 3: Dominance and Fidelity
  • Lemma 1: Penalty–trust-region KKT equivalence for the quadratic surrogate
  • proof
  • Lemma 2: Inscribed/circumscribed ball bounds
  • proof
  • Lemma 3: Inverse perturbation bound
  • proof
  • Lemma 4: Geometry discrepancy bound
  • ...and 3 more