Forget and Explain: Transparent Verification of GNN Unlearning

Imran Ahsan, Hyunwook Yu, Jinsung Kim, Mucheol Kim

TL;DR

The paper tackles verifiable forgetting in GNNs under privacy regulations by introducing an explainability-driven verifier that compares pre- and post-deletion model states using attribution shifts and local structural changes. These are quantified through five metrics (RA, HS, ESD, GED, GRS) and supplemented by a graph-wide membership-inference (MI) ROC-AUC signal. The approach is model- and method-agnostic: it is applied to two backbones (GCN, GAT), four unlearning strategies, and five benchmark datasets, and it yields a human-readable audit trail of forgetting. The results indicate that Retrain and GNNDelete achieve near-complete forgetting, GraphEditor only partially erases influence, and IDEA leaves residual signals, with MI serving as a supplementary privacy check. By providing auditable evidence beyond accuracy or MI metrics, the framework supports transparency, trust, and regulatory compliance, and lays groundwork for broader verification of GNN unlearning in real-world settings.

Abstract

Graph neural networks (GNNs) are increasingly used to model complex patterns in graph-structured data. However, enabling them to "forget" designated information remains challenging, especially under privacy regulations such as the GDPR. Existing unlearning methods largely optimize for efficiency and scalability, yet they offer little transparency, and the black-box nature of GNNs makes it difficult to verify whether forgetting has truly occurred. We propose an explainability-driven verifier for GNN unlearning that snapshots the model before and after deletion, using attribution shifts and localized structural changes (for example, graph edit distance) as transparent evidence. The verifier uses five explainability metrics: residual attribution, heatmap shift, explainability score deviation, graph edit distance, and a diagnostic graph rule shift. We evaluate two backbones (GCN, GAT) and four unlearning strategies (Retrain, GraphEditor, GNNDelete, IDEA) across five benchmarks (Cora, Citeseer, Pubmed, Coauthor-CS, Coauthor-Physics). Results show that Retrain and GNNDelete achieve near-complete forgetting, GraphEditor provides partial erasure, and IDEA leaves residual signals. These explanation deltas provide the primary, human-readable evidence of forgetting; we also report membership-inference ROC-AUC as a complementary, graph-wide privacy signal.
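To make the before/after comparison concrete, the following is a minimal sketch of how two explanation deltas and the membership-inference ROC-AUC could be computed from a pair of snapshots. The abstract does not give the metric formulas, so the definitions here are illustrative assumptions rather than the paper's: `residual_attribution` is taken to be the share of attribution mass on the forget set that survives unlearning, and `heatmap_shift` the mean absolute change in attribution. All function names and the toy data are hypothetical. Only `roc_auc` follows a standard definition (the probability that a random member outscores a random non-member).

```python
from typing import Dict, Hashable, Iterable, Sequence


def residual_attribution(
    before: Dict[Hashable, float],
    after: Dict[Hashable, float],
    forget_set: Iterable[Hashable],
) -> float:
    """Assumed definition: fraction of pre-deletion attribution mass on the
    forget set that is still present after unlearning (0.0 = fully erased)."""
    forget = list(forget_set)
    mass_before = sum(abs(before.get(e, 0.0)) for e in forget)
    mass_after = sum(abs(after.get(e, 0.0)) for e in forget)
    return mass_after / mass_before if mass_before > 0 else 0.0


def heatmap_shift(
    before: Dict[Hashable, float], after: Dict[Hashable, float]
) -> float:
    """Assumed definition: mean absolute attribution change over every
    element that appears in either snapshot."""
    keys = set(before) | set(after)
    if not keys:
        return 0.0
    return sum(abs(after.get(k, 0.0) - before.get(k, 0.0)) for k in keys) / len(keys)


def roc_auc(member_scores: Sequence[float], nonmember_scores: Sequence[float]) -> float:
    """Standard pairwise ROC-AUC: probability that a random member gets a
    higher attack score than a random non-member (ties count as 0.5)."""
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


# Toy edge attributions before and after deleting edge ("a", "b").
attr_before = {("a", "b"): 0.8, ("b", "c"): 0.2}
attr_after = {("a", "b"): 0.0, ("b", "c"): 0.3}

print("residual attribution:", residual_attribution(attr_before, attr_after, [("a", "b")]))
print("heatmap shift:", heatmap_shift(attr_before, attr_after))
print("MI ROC-AUC:", roc_auc([0.9, 0.8], [0.1, 0.2]))
```

Under these assumed definitions, a residual attribution near zero together with an MI ROC-AUC near 0.5 on the forgotten elements would be consistent with effective forgetting, while a high residual would flag leftover influence.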

Paper Structure

This paper contains 11 sections, 6 equations, 1 figure, and 2 tables.

Figures (1)

  • Figure 1: Overview of the explainability-driven verification pipeline