AdvSGM: Differentially Private Graph Learning via Adversarial Skip-gram Model
Sen Zhang, Qingqing Ye, Haibo Hu, Jianliang Xu
TL;DR
AdvSGM addresses the privacy risks of graph skip-gram embeddings by introducing a differential privacy framework for graphs that leverages adversarial training. The core idea is to privatize the skip-gram via two optimizable noise terms embedded in the adversarial module and to achieve gradient perturbation by carefully tuning the relative weights between the skip-gram and adversarial components, ensuring node-level $(\epsilon,\delta)$-DP through post-processing. The authors characterize privacy with Rényi differential privacy and subsampling amplification, derive a practical training algorithm whose complexity scales linearly with batch size, and prove DP guarantees for the discriminator that transfer to the generator. Empirically, AdvSGM outperforms state-of-the-art private graph embeddings on link prediction and node clustering across six real-world datasets, especially at moderate privacy budgets, demonstrating a favorable privacy-utility trade-off that enables private graph representations for downstream tasks.
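As a rough illustration of how a Rényi DP (RDP) guarantee is turned into an $(\epsilon,\delta)$-DP statement, the sketch below applies the standard conversion $\epsilon = \epsilon_{\mathrm{RDP}}(\alpha) + \log(1/\delta)/(\alpha - 1)$ and picks the best order $\alpha$. It assumes a plain Gaussian mechanism with L2 sensitivity 1 composed over a fixed number of steps; this is a generic stand-in, not the accounting used by AdvSGM, and it omits subsampling amplification. The function names, the default range of orders, and the parameter values are hypothetical.

```python
import math


def gaussian_rdp(alpha, sigma):
    """RDP of order alpha for a Gaussian mechanism with L2 sensitivity 1.

    Assumption: a generic Gaussian mechanism, not AdvSGM's noise terms.
    """
    return alpha / (2.0 * sigma ** 2)


def rdp_to_dp(rdp_eps, alpha, delta):
    """Standard conversion from (alpha, rdp_eps)-RDP to (eps, delta)-DP."""
    return rdp_eps + math.log(1.0 / delta) / (alpha - 1.0)


def best_epsilon(sigma, steps, delta, orders=range(2, 65)):
    """Compose `steps` mechanisms (RDP adds up) and minimize eps over orders."""
    return min(
        rdp_to_dp(steps * gaussian_rdp(alpha, sigma), alpha, delta)
        for alpha in orders
    )


if __name__ == "__main__":
    # Hypothetical settings chosen only to show the trend.
    for sigma in (1.0, 2.0, 4.0):
        print(sigma, best_epsilon(sigma, steps=10, delta=1e-5))
```

Larger noise scales yield a smaller $\epsilon$ for the same number of steps, which is the trade-off a privacy accountant tracks during training.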
Abstract
The skip-gram model (SGM), which employs a neural network to generate node vectors, serves as the basis for numerous popular graph embedding techniques. However, since the training datasets contain sensitive linkage information, the parameters of a released SGM may encode private information and pose significant privacy risks. Differential privacy (DP) is a rigorous standard for protecting individual privacy in data analysis. Nevertheless, applying differential privacy to skip-gram on graphs is highly challenging: the complex link relationships can result in high sensitivity and therefore necessitate substantial noise injection. To tackle this challenge, we present AdvSGM, a differentially private skip-gram for graphs via adversarial training. Our core idea is to leverage adversarial training to privatize skip-gram while improving its utility. To this end, we develop a novel adversarial training module by devising two optimizable noise terms that correspond to the parameters of a skip-gram. By fine-tuning the weights between modules within AdvSGM, we can achieve differentially private gradient updates without additional noise injection. Extensive experimental results on six real-world graph datasets show that AdvSGM preserves high data utility across different downstream tasks.
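For readers unfamiliar with the underlying objective, the following is a minimal sketch of the standard, non-private skip-gram loss with negative sampling for a single node pair. It is the generic textbook formulation, assumed here for illustration only; it does not include AdvSGM's adversarial module or its noise terms, and the function names and example vectors are hypothetical.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def skipgram_loss(center, context, negatives):
    """Negative-sampling skip-gram loss for one (center, context) node pair.

    The positive pair is pushed toward a high inner product, and each
    sampled negative node toward a low one.
    """
    loss = -log_sigmoid(dot(center, context))
    for neg in negatives:
        loss -= log_sigmoid(-dot(center, neg))
    return loss


if __name__ == "__main__":
    # Hypothetical 2-dimensional embeddings.
    v_center = [0.5, -0.2]
    v_context = [0.4, 0.1]
    v_negatives = [[-0.3, 0.2], [0.1, -0.6]]
    print(skipgram_loss(v_center, v_context, v_negatives))
```

Because every linked pair contributes such a term, a single node with many links can influence many gradient updates, which is the source of the high sensitivity the abstract refers to.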
