Privacy-Preserving Graph Embedding based on Local Differential Privacy
Zening Li, Rong-Hua Li, Meihao Liao, Fusheng Jin, Guoren Wang
TL;DR
This work tackles privacy in graph embedding by introducing PrivGE, a local differential privacy (LDP) framework that privatizes high-dimensional node features with the HDS mechanism and decouples feature transformation from graph propagation. Embeddings are learned through personalized PageRank-based propagation, which keeps the representations robust even under stringent privacy budgets $\epsilon$. The theoretical analysis gives a utility bound of $\max_j|\tilde{z}_{v,j}-z_{v,j}| = O(\log(d/\delta))$, an improvement over bounded mechanisms, and experiments on five real-world datasets show state-of-the-art results in node classification and link prediction under LDP. The work is a step toward privacy-preserving graph learning in decentralized settings, with implications for social networks and other sensitive graph-structured data.
Abstract
Graph embedding has become a powerful tool for learning latent representations of nodes in a graph. Despite its superior performance in various graph-based machine learning tasks, serious privacy concerns arise when the graph data contains personal or sensitive information. To address this issue, we investigate and develop graph embedding algorithms that satisfy local differential privacy (LDP). We introduce a novel privacy-preserving graph embedding framework, named PrivGE, to protect node data privacy. Specifically, we propose an LDP mechanism to obfuscate node data and utilize personalized PageRank as the proximity measure to learn node representations. Furthermore, we provide a theoretical analysis of the privacy guarantees and utility offered by the PrivGE framework. Extensive experiments on several real-world graph datasets demonstrate that PrivGE achieves an optimal balance between privacy and utility, and significantly outperforms existing methods in node classification and link prediction tasks.
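The two-stage pipeline described above (perturb node features locally, then propagate them with personalized PageRank) can be illustrated with a minimal sketch. This is not the paper's implementation: the per-dimension one-bit mechanism below is a standard stand-in for the LDP mechanism proposed in the paper, the even budget split `eps / d` is an assumed simplification, and the function names, the row-normalized transition matrix, the truncation depth `num_steps`, and the teleport probability `alpha` are all hypothetical choices made for illustration. Features are assumed to be scaled to $[-1, 1]$.

```python
import math
import random


def one_bit_perturb(x, eps, rng):
    """One-bit mechanism for a value x in [-1, 1] (a stand-in, not the paper's mechanism).

    Returns +c or -c, where c = (e^eps + 1) / (e^eps - 1). The output is an
    unbiased estimate of x and satisfies eps-LDP.
    """
    c = (math.exp(eps) + 1.0) / (math.exp(eps) - 1.0)
    p_plus = 0.5 + x / (2.0 * c)  # always in (0, 1) because |x| <= 1 < c
    return c if rng.random() < p_plus else -c


def privatize_features(features, eps, rng):
    """Each node perturbs its own feature row; the budget is split evenly over d dimensions."""
    d = len(features[0])
    eps_per_dim = eps / d  # sequential composition: d reports at eps/d each give eps in total
    return [[one_bit_perturb(x, eps_per_dim, rng) for x in row] for row in features]


def ppr_propagate(adj, features, alpha=0.15, num_steps=10):
    """Truncated personalized PageRank propagation.

    Computes sum_{l=0}^{num_steps} alpha * (1 - alpha)^l * P^l X, where P is the
    row-normalized adjacency matrix given as adjacency lists.
    """
    n, d = len(adj), len(features[0])
    current = [row[:] for row in features]                 # P^0 X
    out = [[alpha * v for v in row] for row in features]   # l = 0 term
    weight = alpha
    for _ in range(num_steps):
        nxt = [[0.0] * d for _ in range(n)]
        for u in range(n):
            if not adj[u]:
                continue  # isolated node: no neighbours to average over
            inv_deg = 1.0 / len(adj[u])
            for v in adj[u]:
                for j in range(d):
                    nxt[u][j] += inv_deg * current[v][j]
        current = nxt
        weight *= 1.0 - alpha
        for u in range(n):
            for j in range(d):
                out[u][j] += weight * current[u][j]
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    adj = [[1, 2], [0, 2], [0, 1, 3], [2]]  # small undirected toy graph
    features = [[0.2, -0.5], [0.9, 0.1], [-0.3, 0.4], [0.0, 1.0]]
    noisy = privatize_features(features, eps=2.0, rng=rng)
    embeddings = ppr_propagate(adj, noisy, alpha=0.15, num_steps=10)
    for node, row in enumerate(embeddings):
        print(node, [round(v, 3) for v in row])
```

Because the perturbation is unbiased and propagation is a weighted average over neighbourhoods, the noise injected by individual nodes partially cancels in the propagated features. This averaging effect is the intuition behind separating feature perturbation from graph propagation.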
