Stealing Training Graphs from Graph Neural Networks
Minhua Lin, Enyan Dai, Junjie Xu, Jinyuan Jia, Xiang Zhang, Suhang Wang
TL;DR
This work studies the privacy risk that a released Graph Neural Network leaks the graphs it was trained on, and formulates a graph stealing attack that requires no access to the private training data. It introduces GraphSteal, a framework that pairs a DiGress-based graph diffusion generator, which produces candidate graphs, with a model-parameter-guided selection mechanism that recovers training graphs from those candidates. The selection step is KKT-informed: it rests on a theoretical link between trained GNN parameters and the training data. In experiments on molecular datasets (FreeSolv, ESOL, QM9) and across architectures (GCN, GIN, GTN), GraphSteal is reported to achieve high realism, validity, and reconstruction rates, to outperform baselines, and to remain robust under distribution shift and architectural variation. The results point to a substantive privacy vulnerability in GNNs and motivate stronger defenses and privacy-preserving training protocols for graph data.
Abstract
Graph Neural Networks (GNNs) have shown promising results in modeling graphs across a variety of tasks. Training GNNs, especially for specialized domains such as bioinformatics, demands extensive expert annotation, which is expensive and often contains sensitive information about the data providers. Trained GNN models are often shared for real-world deployment. Because neural networks can memorize training samples, the parameters of a GNN carry a high risk of leaking private training data. Our theoretical analysis shows strong connections between trained GNN parameters and the training graphs used, confirming the training graph leakage issue. However, training data leakage from trained GNNs remains largely unexplored. We therefore investigate the novel problem of stealing training graphs from trained GNNs. To obtain high-quality graphs that resemble the target training set, we deploy a graph diffusion model with diffusion noise optimization as the graph generator. Furthermore, we propose a selection method that leverages the GNN model parameters to identify training graphs among the samples generated by the graph diffusion model. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed framework in stealing training graphs from a trained GNN.
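The parameter-guided selection idea can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a KKT-style stationarity condition in which the trained parameters are approximately a nonnegative combination of per-sample gradients, and it replaces the GNN with a linear model so that each candidate's "gradient" is just its feature vector. The function name `select_by_parameter_fit` and all variables are hypothetical.

```python
import random


def select_by_parameter_fit(theta, cand_grads, k, steps=2000):
    """Fit nonnegative weights lam so that sum_i lam[i] * cand_grads[i]
    approximates theta, then return the indices of the k largest weights.

    Toy stand-in for a parameter-guided selection step (an assumption,
    not the paper's algorithm).
    """
    n, d = len(cand_grads), len(theta)
    # Step size from the squared Frobenius norm, an upper bound on the
    # Lipschitz constant of the least-squares gradient.
    step = 1.0 / sum(g * g for row in cand_grads for g in row)
    lam = [0.0] * n
    for _ in range(steps):
        # Residual r = sum_i lam[i] * grad_i - theta.
        r = [sum(lam[i] * cand_grads[i][j] for i in range(n)) - theta[j]
             for j in range(d)]
        # Projected gradient step: clipping at zero keeps weights nonnegative.
        lam = [max(0.0, lam[i] - step * sum(cand_grads[i][j] * r[j]
                                            for j in range(d)))
               for i in range(n)]
    ranked = sorted(range(n), key=lambda i: lam[i], reverse=True)
    return ranked[:k], lam


# Toy data: 12 candidates, of which 4 are the hidden "training" samples.
random.seed(0)
d, n, k = 30, 12, 4
cands = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
train_idx = [1, 4, 7, 10]
true_lam = {i: random.uniform(0.5, 1.5) for i in train_idx}
# Parameters built as a positive combination of the training samples only.
theta = [sum(true_lam[i] * cands[i][j] for i in train_idx) for j in range(d)]

selected, lam = select_by_parameter_fit(theta, cands, k)
print(sorted(selected))
```

In this toy setting the candidates that actually shaped the parameters receive the largest fitted weights, while decoys are driven toward zero. With a real GNN the per-candidate vectors would be gradients of the model output with respect to its parameters, and generated candidates would only approximate the true training graphs, so recovery would be noisier.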
