Graph-Level Label-Only Membership Inference Attack against Graph Neural Networks
Jiazhu Dai, Yubing Lu
TL;DR
This work introduces GLO-MIA, the first label-only membership inference attack for graph neural networks in graph classification tasks. GLO-MIA perturbs the effective features of a target graph, queries the model with the perturbed graphs, and derives a robustness score that distinguishes training graphs from unseen graphs using only label outputs. A shadow model is used to calibrate the perturbation magnitude and the decision threshold. Empirically, GLO-MIA achieves up to 0.825 attack accuracy and consistently surpasses the gap-based baselines, while approaching the performance of probability-based MIAs despite having no access to probability vectors. The results underscore a significant privacy risk for GNNs in realistic label-only settings and motivate future defenses and broader label-only attack strategies.
Abstract
Graph neural networks (GNNs) are widely used for graph-structured data, but in graph classification tasks they are vulnerable to membership inference attacks (MIAs), which determine whether a graph was part of the training dataset and can therefore leak private data. Existing MIAs rely on prediction probability vectors and become ineffective when only prediction labels are available. We propose a Graph-level Label-Only Membership Inference Attack (GLO-MIA), which is based on the intuition that the target model's predictions on training data are more stable than those on testing data. GLO-MIA generates a set of perturbed graphs for a target graph by adding perturbations to its effective features, then queries the target model with the perturbed graphs to obtain their prediction labels, which are used to calculate the robustness score of the target graph. Finally, by comparing the robustness score with a predefined threshold, the membership of the target graph can be inferred correctly with high probability. Our evaluation on three datasets and four GNN models shows that GLO-MIA achieves an attack accuracy of up to 0.825, outperforming baseline work by 8.5% and closely matching the performance of probability-based MIAs, even with only prediction labels.
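The attack loop described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: "effective features" are taken to be the non-zero entries of the node feature matrix, the perturbation is Gaussian noise, and the robustness score is the fraction of perturbed graphs whose predicted label matches the label predicted for the unperturbed graph. The names `perturb_effective_features`, `robustness_score`, `infer_membership`, and the `query_label` callback are hypothetical.

```python
import numpy as np


def perturb_effective_features(x, scale, rng):
    """Add Gaussian noise to non-zero entries only.

    Assumption: non-zero entries stand in for the "effective features".
    """
    mask = x != 0
    noise = rng.normal(0.0, scale, size=x.shape)
    return x + noise * mask


def robustness_score(query_label, x, adj, n_perturb=100, scale=0.1, seed=0):
    """Fraction of perturbed graphs that keep the unperturbed graph's label.

    query_label(x, adj) is a hypothetical label-only oracle for the target
    model: it returns a single class label and nothing else.
    """
    rng = np.random.default_rng(seed)
    base_label = query_label(x, adj)
    hits = 0
    for _ in range(n_perturb):
        x_pert = perturb_effective_features(x, scale, rng)
        if query_label(x_pert, adj) == base_label:
            hits += 1
    return hits / n_perturb


def infer_membership(score, threshold):
    """Predict 'member' when the prediction is stable enough under perturbation."""
    return score >= threshold


def toy_model(x, adj):
    """Stand-in for a real GNN, used only to make the sketch runnable."""
    return int(x.sum() > 0)


if __name__ == "__main__":
    x = np.array([[1.0, 0.0], [0.0, 2.0]])  # node features
    adj = np.array([[0, 1], [1, 0]])        # adjacency matrix (unused by toy_model)
    score = robustness_score(toy_model, x, adj, n_perturb=50, scale=0.1)
    print(score, infer_membership(score, threshold=0.9))
```

In this sketch `scale` and `threshold` are fixed by hand. In the setting described in the TL;DR, both would instead be calibrated on a shadow model whose training membership the attacker knows.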
