Node-level Contrastive Unlearning on Graph Neural Networks
Hong kyu Lee, Qiuchen Zhang, Carl Yang, Li Xiong
TL;DR
Graph unlearning in GNNs is difficult because nodes depend on one another. Node-CUL addresses this by operating in the embedding space with two components: node representation unlearning, which uses a contrastive loss to push unlearning nodes toward unseen regions and away from same-class neighbors, and neighborhood reconstruction, which mitigates utility loss by re-aligning neighbors with their remaining connections. Empirical results across multiple datasets and GNN architectures show that Node-CUL achieves superior unlearn efficacy (a low unlearn score) while preserving or improving task accuracy compared with baselines such as GNNDelete and GIF, and that its privacy protection, as measured by LiRA, is comparable to that of fully retrained models. The method is model-agnostic and scalable, and it uses a principled termination condition to decide when to stop unlearning. Future work will extend it to edge unlearning and joint node-edge unlearning scenarios.
Abstract
Graph unlearning aims to remove a subset of graph entities (i.e., nodes and edges) from a graph neural network (GNN) trained on the graph. Unlike machine unlearning for models trained on Euclidean-structured data, effectively unlearning a model trained on non-Euclidean-structured data, such as graphs, is challenging because graph entities exhibit mutual dependencies. Existing works utilize graph partitioning, influence functions, or additional layers to achieve graph unlearning. However, none of them achieves both high scalability and effectiveness without additional constraints. In this paper, we achieve more effective graph unlearning by utilizing the embedding space. The primary training objective of a GNN is to generate an embedding for each node that encapsulates both structural information and node feature representations. Thus, directly optimizing the embedding space can effectively remove the target nodes' information from the model. Based on this intuition, we propose node-level contrastive unlearning (Node-CUL). It removes the influence of the target nodes (unlearning nodes) by contrasting their embeddings with those of the remaining nodes and of the unlearning nodes' neighbors. Through iterative updates, the embeddings of unlearning nodes gradually become similar to those of unseen nodes, effectively removing the learned information without directly incorporating unseen data. In addition, we introduce a neighborhood reconstruction method that optimizes the embeddings of the neighbors to remove the influence of the unlearning nodes while maintaining the utility of the GNN model. Experiments on various graph datasets and models show that Node-CUL achieves the best unlearn efficacy and enhanced model utility while requiring computing resources comparable to those of existing frameworks.
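To make the contrastive idea concrete, the following is a minimal sketch of an InfoNCE-style objective for a single unlearning node. It is not the exact Node-CUL loss, which the abstract does not specify. The function names, the temperature value, the choice of cosine similarity, and the toy two-dimensional embeddings are all assumptions made for illustration. The loss is small when the unlearning node's embedding is close to the embeddings it is pulled toward (here, remaining nodes of other classes) and far from the embeddings it is pushed away from (here, its same-class neighbors).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_unlearning_loss(anchor, pull_toward, push_away, temperature=0.5):
    """Hypothetical InfoNCE-style loss for one unlearning node.

    anchor:      embedding of the unlearning node
    pull_toward: embeddings the anchor should become similar to
    push_away:   embeddings the anchor should become dissimilar to
    """
    pos = [math.exp(cosine(anchor, p) / temperature) for p in pull_toward]
    neg = [math.exp(cosine(anchor, n) / temperature) for n in push_away]
    denom = sum(pos) + sum(neg)
    # Average negative log-probability of picking a "pull" sample
    # among all contrasted samples.
    return -sum(math.log(p / denom) for p in pos) / len(pos)


# Toy embeddings (assumed values, not taken from the paper).
same_class_neighbors = [[0.9, 0.1], [1.0, 0.2]]
other_class_remaining = [[0.0, 1.0], [0.1, 0.9]]

# Before unlearning: the node sits with its own class.
loss_before = contrastive_unlearning_loss(
    [1.0, 0.0], other_class_remaining, same_class_neighbors
)
# After unlearning: the node has moved away from its class.
loss_after = contrastive_unlearning_loss(
    [0.0, 1.0], other_class_remaining, same_class_neighbors
)
print(loss_before > loss_after)
```

In an actual GNN the embeddings would come from the model and the loss would be minimized by gradient updates on the model parameters; the pure-Python version here only illustrates which direction the objective rewards.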
