GONE: Structural Knowledge Unlearning via Neighborhood-Expanded Distribution Shaping

Chahana Dahal, Ashutosh Balasubramaniam, Zuobin Xiong

Abstract

Knowledge unlearning is a pressing and challenging task for Large Language Models (LLMs): their unprecedented capacity to memorize and digest training data at scale raises significant safety, privacy, and intellectual property concerns. However, existing approaches, including parameter editing, fine-tuning, and distillation-based methods, focus on flat sentence-level data and overlook the relational, multi-hop, and reasoned knowledge in naturally structured data. To address this gap, this paper introduces Graph Oblivion and Node Erasure (GONE), a benchmark for evaluating knowledge unlearning over structured knowledge graph (KG) facts in LLMs. This KG-based benchmark disentangles three effects of unlearning: direct fact removal, reasoning-based leakage, and catastrophic forgetting. In addition, the paper proposes Neighborhood-Expanded Distribution Shaping (NEDS), a novel unlearning framework that leverages graph connectivity to identify anchor correlated neighbors, enforcing a precise decision boundary between the forgotten fact and its semantic neighborhood. Evaluations on LLaMA-3-8B and Mistral-7B, compared with multiple knowledge editing and unlearning methods, show that NEDS achieves superior performance (1.000 on unlearning efficacy and 0.839 on locality) on GONE and other benchmarks. Code is available at https://anonymous.4open.science/r/GONE-4679/.

Paper Structure

This paper contains 46 sections, 16 equations, 17 figures, 16 tables.

Figures (17)

  • Figure 1: Overview of the framework.
  • Figure 2: Topological Orthogonality in GONE. We enforce a strict separation between the Forget Neighborhood ($\mathcal{F}$, $k \le 3$) and the Retain Set ($\mathcal{R}$). The schema barrier and geodesic distance constraint ($d(t, t') > 3$) ensure that retained facts (e.g., Germany) provide no latent inference path to reconstruct the forgotten target (e.g., Nobel Prize).
  • Figure 3: Change in Knowledge Connectivity Score ($\Delta$KCS) for baselines on GONE.
  • Figure 4: The Unlearning Trade-off on GONE (ConceptNet) for Llama-3-8B-Instruct. The color intensity represents the Refusal Rate.
  • Figure 5: System prompt used for the dataset generator.
  • ...and 12 more figures
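
The separation described in the Figure 2 caption (a Forget Neighborhood within $k \le 3$ hops of the target and a Retain Set at geodesic distance $d(t, t') > 3$) can be illustrated with a breadth-first search over an entity graph. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the graph is treated as undirected, nodes stand for entities, the schema barrier is not modeled, and the helper names (`build_adjacency`, `hop_distances`, `split_forget_retain`) as well as the intermediate nodes `n1` to `n4` are hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs (assumed graph format)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def hop_distances(adj, source, max_hops):
    """Return hop counts for every node within max_hops of source (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand past the hop budget
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def split_forget_retain(adj, target, k=3):
    """Split nodes into a forget neighborhood (d <= k) and a retain set (d > k)."""
    forget = set(hop_distances(adj, target, k))  # includes the target itself
    retain = set(adj) - forget                   # everything farther than k hops
    return forget, retain


# Toy chain; only the endpoint labels come from the Figure 2 caption,
# the intermediate nodes are placeholders.
edges = [
    ("Nobel Prize", "n1"),
    ("n1", "n2"),
    ("n2", "n3"),
    ("n3", "n4"),
    ("n4", "Germany"),
]
adj = build_adjacency(edges)
forget, retain = split_forget_retain(adj, "Nobel Prize", k=3)
print(sorted(forget))
print(sorted(retain))
```

In this toy chain, "Germany" sits five hops from the target, so it falls into the retain set. A real pipeline would presumably operate on KG triples rather than bare entities and would add the schema-level check that this sketch omits.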