OpenGU: A Comprehensive Benchmark for Graph Unlearning

Bowen Fan; Yuming Ai; Xunkai Li; Zhilin Guo; Rong-Hua Li; Guoren Wang

TL;DR

OpenGU addresses the need for fair, scalable benchmarking of graph unlearning (GU) with a unified platform that integrates $16$ SOTA GU algorithms and $37$ multi-domain datasets, enabling a $3\times3$ cross-task evaluation across $13$ GNN backbones. The benchmark provides unified APIs and evaluates GU methods along three dimensions (effectiveness, efficiency, and robustness) under realistic unlearning requests and privacy attacks such as membership inference and poisoning. Across node, edge, and graph tasks, OpenGU finds strong performance for certain learning-based and IF-based methods, highlights memory and time bottlenecks on large graphs, and exposes robustness gaps under noise, sparsity, and varying unlearning intensities. These findings point toward generalized GU frameworks, standardized forgetting metrics, and scalable, privacy-preserving graph learning in real systems.
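As a rough illustration of the $3\times3$ cross-task evaluation mentioned above, the sketch below enumerates every pairing of an unlearning request level with a downstream task level. The downstream levels (node, edge, graph) come from the summary. The request levels (node, edge, feature), the `request-task` naming order, and all identifiers are assumptions made for this example, not OpenGU's actual API.

```python
from itertools import product

# Downstream task levels named in the summary above.
DOWNSTREAM_TASKS = ("node", "edge", "graph")
# Unlearning request levels: an ASSUMPTION for illustration only.
UNLEARNING_REQUESTS = ("node", "edge", "feature")
# Evaluation dimensions named in the summary above.
DIMENSIONS = ("effectiveness", "efficiency", "robustness")


def cross_task_grid(requests=UNLEARNING_REQUESTS, tasks=DOWNSTREAM_TASKS):
    """Return every (request, task) pairing as a 'request-task' label."""
    return [f"{request}-{task}" for request, task in product(requests, tasks)]


grid = cross_task_grid()
for setting in grid:
    # Each setting would be scored along every evaluation dimension.
    print(setting, "->", ", ".join(DIMENSIONS))
```

Under these assumptions the grid has nine settings, including the `node-node` and `edge-edge` combinations that appear in the figure captions.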

Abstract

Graph Machine Learning is essential for understanding and analyzing relational data. However, privacy-sensitive applications demand the ability to efficiently remove sensitive information from trained graph neural networks (GNNs), avoiding the unnecessary time and space overhead caused by retraining models from scratch. To address this issue, Graph Unlearning (GU) has emerged as a critical solution, with the potential to support dynamic graph updates in data management systems and enable scalable unlearning in distributed data systems while ensuring privacy compliance. Unlike machine unlearning in computer vision or other fields, GU faces unique difficulties due to the non-Euclidean nature of graph data and the recursive message-passing mechanism of GNNs. The diversity of downstream tasks and the complexity of unlearning requests further amplify these challenges. Despite the proliferation of GU strategies, the absence of a benchmark providing fair comparisons for GU, together with the limited flexibility in combining downstream tasks and unlearning requests, has led to inconsistent evaluations and hindered the development of this domain. To fill this gap, we present OpenGU, the first GU benchmark, which integrates 16 SOTA GU algorithms and 37 multi-domain datasets and supports various downstream tasks with 13 GNN backbones when responding to flexible unlearning requests. This unified benchmark framework allows us to provide a comprehensive and fair evaluation for GU. Through extensive experimentation, we draw 8 crucial conclusions about existing GU methods and gain insights into their limitations, shedding light on potential avenues for future research.

Paper Structure (26 sections, 3 equations, 8 figures, 7 tables)

Introduction
Definitions and Background
Graph Neural Networks
Downstream Tasks
Unlearning Requests
GU Taxonomy
Benchmark Design
Dataset Overview for OpenGU
Algorithm Framework for OpenGU
Evaluation Strategy for OpenGU
Experiments and Analyses
Reasoning Performance Comparison
Forgetting Performance Comparison
Trade-off between Forgetting and Reasoning
Algorithm Complexity Analyses
...and 11 more sections

Figures (8)

Figure 1: An overview of the OpenGU framework, illustrating the key components and methodologies involved in GU.
Figure 2: AUC-ROC ± STD comparison under MIA for node-node task with SGC backbone.
Figure 3: AUC-ROC comparison under PA for edge-edge task before and after unlearning with GraphSAGE backbone.
Figure 4: Trade-off between forgetting and reasoning on Cora, Citeseer, and PubMed, and on average.
Figure 5: Unlearning Time Performance on Cora, PubMed, ogbn-arxiv and ogbn-products.
...and 3 more figures
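
Figures 2 and 3 report AUC-ROC under attack. As a minimal sketch of how such a score can be computed, the function below evaluates AUC-ROC from attack confidences using the pairwise (Mann-Whitney) formulation. The scores are hypothetical, and how OpenGU configures its membership inference attack is not specified here, so the labeling convention (1 = member, 0 = non-member) is an assumption.

```python
def auc_roc(scores, labels):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical attack confidences that a sample was part of training.
member_scores = [0.9, 0.8, 0.7]       # assumed label 1 (member)
nonmember_scores = [0.6, 0.75, 0.2]   # assumed label 0 (non-member)

scores = member_scores + nonmember_scores
labels = [1] * len(member_scores) + [0] * len(nonmember_scores)
attack_auc = auc_roc(scores, labels)
print(f"attack AUC-ROC: {attack_auc:.3f}")
```

Under one common reading, an attack AUC close to 0.5 on unlearned samples suggests the attacker cannot distinguish them from data the model never saw, while a value near 1.0 suggests their membership is still detectable.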
