GraphToxin: Reconstructing Full Unlearned Graphs from Graph Unlearning

Ying Song, Balaji Palanisamy

TL;DR

GraphToxin presents the first full graph reconstruction attack against graph unlearning, leveraging a three-module framework—gradient matching, curvature matching via a Fisher-information surrogate, and feature-smoothness regularization—to recover deleted nodes, their neighbors, and associated topology from gradient differences $\Delta\mathcal{L}(\mathcal{G}_d,\theta,\theta^*)$. It extends to multiple node removals and data-free black-box scenarios using zeroth-order gradient estimation and semantic calibration, supported by a comprehensive evaluation framework with feature-, global-, and performance-level metrics. Empirical results show GraphToxin outperforms baselines across diverse datasets, GNN backbones, and unlearning methods, and defenses like node-DP and gradient compression largely fail to mitigate the attack. The findings highlight severe privacy risks of current graph unlearning approaches and underscore the need for stronger, more holistic defenses and worst-case evaluations in graph-based privacy research.

Abstract

Graph unlearning has emerged as a promising solution for complying with "the right to be forgotten" regulations by enabling the removal of sensitive information upon request. However, this solution is not foolproof. The involvement of multiple parties creates new attack surfaces, and residual traces of deleted data can still remain in the unlearned graph neural networks. These vulnerabilities can be exploited by attackers to recover the supposedly erased samples, thereby undermining the inherent functionality of graph unlearning. In this work, we propose GraphToxin, the first graph reconstruction attack against graph unlearning. Specifically, we introduce a novel curvature matching module to provide fine-grained guidance for full unlearned graph recovery. We demonstrate that GraphToxin can successfully subvert the regulatory guarantees expected from graph unlearning: it can recover not only a deleted individual's information and personal links but also sensitive content from their connections, thereby posing substantially more detrimental threats. Furthermore, we extend GraphToxin to multiple node removals under both white-box and black-box settings. We highlight the necessity of a worst-case analysis and propose a comprehensive evaluation framework to systematically assess the attack performance under both random and worst-case node removals. This provides a more robust and realistic measure of the vulnerability of graph unlearning methods to graph reconstruction attacks. Our extensive experiments demonstrate the effectiveness and flexibility of GraphToxin. Notably, we show that existing defense mechanisms are largely ineffective against this attack and, in some cases, can even amplify its performance. Given the severe privacy risks posed by GraphToxin, our work underscores the urgent need for the development of more effective and robust defense strategies against this attack.
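The TL;DR states that the black-box variant relies on zeroth-order gradient estimation. The exact estimator is not given in this summary, so the following is only a generic sketch of the idea: a coordinate-wise central finite difference that approximates a gradient using loss queries alone. The names `zeroth_order_gradient`, `loss_fn`, and `mu` are hypothetical and not taken from the paper.

```python
def zeroth_order_gradient(loss_fn, theta, mu=1e-4):
    """Approximate the gradient of loss_fn at theta using only function queries.

    Assumption: a coordinate-wise central difference; the paper's own
    estimator may differ (for example, random-direction sampling).
    """
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += mu    # perturb coordinate i upward
        minus[i] -= mu   # perturb coordinate i downward
        grad.append((loss_fn(plus) - loss_fn(minus)) / (2.0 * mu))
    return grad


def toy_loss(theta):
    # Toy stand-in for a model loss: sum of squares, whose true gradient is 2 * theta.
    return sum(t * t for t in theta)


estimate = zeroth_order_gradient(toy_loss, [1.0, -2.0, 0.5])
```

On this toy quadratic the estimate should be close to `[2.0, -4.0, 1.0]`. Each call costs two loss queries per parameter, which is why practical black-box attacks often switch to random-direction estimators for large models.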

Paper Structure

This paper contains 63 sections, 1 theorem, 16 equations, 6 figures, 8 tables.

Key Result

Theorem 1

When the above assumptions hold, matching the ground-truth gradient difference $\Delta\mathcal{L}(\mathcal{G}_d, \theta,\theta^\ast)$ is equivalent to minimizing $(\Delta_{syn}-\Delta_{obs})^TH^{-1}(\Delta_{syn}-\Delta_{obs})$, where $\Delta_{syn}=\tilde{\Delta}\mathcal{L}(\tilde{\mathcal{G}_d},\the
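The quadratic form in Theorem 1 can be illustrated with a small numerical sketch. The theorem's assumptions and the definition of $H$ are not shown in this summary, so the code below makes its own simplifying assumptions: $H$ is replaced by a diagonal empirical Fisher surrogate (mean of squared per-sample gradients plus a damping term), in line with the Fisher-information surrogate mentioned in the TL;DR. All function and variable names are hypothetical.

```python
def diagonal_fisher(per_sample_grads, damping=1e-3):
    """Diagonal empirical Fisher: mean of squared per-sample gradients.

    Assumption: the diagonal approximation and the damping term are
    illustrative choices, not the paper's stated construction.
    """
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    return [
        sum(g[i] ** 2 for g in per_sample_grads) / n + damping
        for i in range(dim)
    ]


def curvature_matching_loss(delta_syn, delta_obs, h_diag):
    """Compute (d_syn - d_obs)^T H^{-1} (d_syn - d_obs) for a diagonal H."""
    return sum(
        (s - o) ** 2 / h
        for s, o, h in zip(delta_syn, delta_obs, h_diag)
    )


# Toy example: two per-sample gradients in a two-parameter model.
h_diag = diagonal_fisher([[1.0, 0.0], [1.0, 2.0]], damping=0.0)
loss = curvature_matching_loss([1.0, 2.0], [0.0, 0.0], h_diag)
```

Here `h_diag` is `[1.0, 2.0]` and `loss` is `1/1 + 4/2 = 3.0`. Compared with a plain squared distance (which would give 5.0), the residual along the high-curvature coordinate is down-weighted, which is the effect the inverse-curvature weighting has in the quadratic form.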

Figures (6)

  • Figure 1: An overview of GraphToxin. The white-box GraphToxin uses gradient difference for full unlearned graph recovery. The black-box GraphToxin first estimates gradients and then exploits their difference for recovery.
  • Figure 2: Illustrative Example of the Ambiguity of Gradient Difference Matching
  • Figure 3: Impact of Fisher Coefficient
  • Figure 4: Impact of Laplacian Coefficient
  • Figure 5: Impact of the Full Unlearned Graph Size
  • ...and 1 more figure

Theorems & Definitions (4)

  • Definition 1: Single Node Unlearning
  • Definition 2: Multiple Node Unlearning
  • Definition 3: General GRA
  • Theorem 1: Fine-grained Curvature Matching