Model Editing with Graph-Based External Memory

Yash Kumar Atri, Ahmed Alaa, Thomas Hartvigsen

TL;DR

Large language models struggle with hallucinations and outdated knowledge, making post-training updates desirable but risky due to overfitting and forgetting. HYPE introduces a curvature-aware editing framework that leverages hyperbolic graph construction with Poincaré embeddings, Möbius-transformed updates, and a dual stabilization strategy to preserve hierarchical relations during edits. Across CounterFact, CounterFact+, and MQuAKE, HYPE consistently improves edit quality, factual accuracy, and multi-hop reasoning on GPT-J and GPT2-XL, outperforming Euclidean baselines. The approach offers a principled, efficient pathway to targeted, stable knowledge updates with potential applicability to domain-specific factual maintenance.

Abstract

Large language models (LLMs) have revolutionized natural language processing, yet their practical utility is often limited by persistent issues of hallucinations and outdated parametric knowledge. Although post-training model editing offers a pathway for dynamic updates, existing methods frequently suffer from overfitting and catastrophic forgetting. To tackle these challenges, we propose a novel framework that leverages hyperbolic geometry and graph neural networks for precise and stable model edits. We introduce HYPE (HYperbolic Parameter Editing), which comprises three key components: (i) Hyperbolic Graph Construction, which uses Poincaré embeddings to represent knowledge triples in hyperbolic space, preserving hierarchical relationships and preventing unintended side effects by ensuring that edits to parent concepts do not inadvertently affect child concepts; (ii) Möbius-Transformed Updates, which apply hyperbolic addition to propagate edits while maintaining structural consistency within the hyperbolic manifold, unlike conventional Euclidean updates that distort relational distances; and (iii) Dual Stabilization, which combines gradient masking and periodic GNN parameter resetting to prevent catastrophic forgetting by focusing updates on critical parameters and preserving long-term knowledge. Experiments on CounterFact, CounterFact+, and MQuAKE with GPT-J and GPT2-XL demonstrate that HYPE significantly enhances edit stability, factual accuracy, and multi-hop reasoning.
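To make component (ii) concrete, the following is a minimal sketch of Möbius addition and the induced distance on a Poincaré ball of curvature $-c$, using the standard textbook formulas. It is not the authors' implementation: the function names `mobius_add` and `poincare_distance`, the plain-list vector representation, and the example vectors are all assumptions chosen for illustration.

```python
import math

def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def mobius_add(x, y, c=1.0):
    """Mobius addition x (+)_c y on the Poincare ball of curvature -c.

    Standard closed form; the result stays inside the ball of radius
    1/sqrt(c) whenever both inputs are inside it.
    """
    xy, x2, y2 = dot(x, y), dot(x, x), dot(y, y)
    denom = 1 + 2 * c * xy + c * c * x2 * y2
    coef_x = (1 + 2 * c * xy + c * y2) / denom
    coef_y = (1 - c * x2) / denom
    return [coef_x * xi + coef_y * yi for xi, yi in zip(x, y)]

def poincare_distance(x, y, c=1.0):
    """Geodesic distance induced by Mobius addition."""
    diff = mobius_add([-xi for xi in x], y, c)
    norm = math.sqrt(dot(diff, diff))
    return 2 / math.sqrt(c) * math.atanh(math.sqrt(c) * norm)

# Hypothetical example: a point near the boundary plus an update vector.
point = [0.9, 0.0]
update = [0.9, 0.0]

euclidean = [p + u for p, u in zip(point, update)]  # leaves the unit ball
hyperbolic = mobius_add(point, update)              # stays inside it

print("Euclidean norm: ", math.sqrt(dot(euclidean, euclidean)))
print("Hyperbolic norm:", math.sqrt(dot(hyperbolic, hyperbolic)))
```

In this toy case the Euclidean sum has norm greater than 1 and so falls outside the unit ball, while the Möbius sum remains a valid point on the manifold. This is the sense in which a hyperbolic update can keep an embedding structurally consistent where a naive additive update cannot.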

Paper Structure

This paper contains 34 sections, 23 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Overview of the proposed model, HYPE. We begin by constructing a hyperbolic knowledge graph (b) using Poincaré embeddings to encode hierarchical relationships. When an edit is required, we apply Möbius transformations (c) to update the weights while ensuring curvature-aware consistency. To maintain stability, the dual stabilization strategy (d) removes transient or spurious updates. The edited knowledge is then integrated into the model (e), preserving factual accuracy and structural integrity.
  • Figure 2: Left: EDS peaks at $c=1.0$ due to an optimal balance between expansion and numerical stability. Right: Edit success rate declines for $\tau > 0.5$ due to underfitting.
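The dual stabilization step in Figure 1(d) pairs gradient masking with periodic resetting of the GNN parameters. The sketch below shows one plausible reading of that idea on toy data and should not be taken as the paper's procedure. The masking rule (keep entries whose magnitude relative to the largest entry meets a threshold), the names `mask_gradients` and `run_edit_steps`, the parameters `threshold` and `reset_every`, and the plain additive update are all assumptions made for illustration.

```python
def mask_gradients(grads, threshold):
    """Zero out gradient entries whose relative magnitude is below threshold.

    Assumed rule: an entry is 'critical' if |g| / max|g| >= threshold.
    """
    peak = max((abs(g) for g in grads), default=0.0)
    if peak == 0.0:
        return [0.0 for _ in grads]
    return [g if abs(g) / peak >= threshold else 0.0 for g in grads]

def run_edit_steps(weights, gnn_params, grad_fn, steps,
                   lr=0.1, threshold=0.5, reset_every=2):
    """Apply masked updates to weights and periodically reset gnn_params.

    grad_fn is a hypothetical callable returning (weight_grads, gnn_grads).
    A plain additive step is used here for brevity instead of a
    hyperbolic (Mobius) update.
    """
    gnn_init = list(gnn_params)  # snapshot used for the periodic reset
    weights, gnn_params = list(weights), list(gnn_params)
    for step in range(1, steps + 1):
        w_grads, g_grads = grad_fn(weights, gnn_params)
        masked = mask_gradients(w_grads, threshold)
        weights = [w - lr * g for w, g in zip(weights, masked)]
        gnn_params = [p - lr * g for p, g in zip(gnn_params, g_grads)]
        if step % reset_every == 0:
            gnn_params = list(gnn_init)  # discard accumulated GNN drift
    return weights, gnn_params

# Toy usage with constant gradients (purely illustrative).
def constant_grads(weights, gnn_params):
    return [1.0, 0.1], [1.0]

final_w, final_gnn = run_edit_steps([0.0, 0.0], [5.0], constant_grads, steps=4)
print(final_w, final_gnn)
```

Under these assumptions, only the first weight moves, because the second gradient entry falls below the threshold, and the GNN parameters return to their initial snapshot after every second step. A higher threshold masks more entries, so fewer parameters receive any update.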