Structure-Preference Enabled Graph Embedding Generation under Differential Privacy

Sen Zhang; Qingqing Ye; Haibo Hu

TL;DR

This paper addresses the privacy risks of publishing graph embeddings by introducing SE-PrivGEmb, a skip-gram-based method that supports user-defined structure preferences under node-level differential privacy. It introduces a noise-tolerance mechanism that perturbs only non-zero gradient components, and it designs the negative sampling probabilities so that skip-gram preserves arbitrary node proximities, with privacy accounted for under node-level Rényi DP. The authors provide a formal analysis of structure-preference preservation and privacy composition, and validate the approach on six real-world networks, where SE-PrivGEmb outperforms state-of-the-art private methods on structural equivalence and link prediction. The result is customizable, privacy-preserving graph embeddings for privacy-conscious graph mining tasks.

Abstract

Graph embedding generation techniques aim to learn a low-dimensional vector for each node in a graph and have recently gained increasing research attention. Publishing low-dimensional node vectors enables various graph analysis tasks, such as structural equivalence and link prediction. Yet, improper publication opens a backdoor to malicious attackers, who can infer sensitive information about individuals from the low-dimensional node vectors. Existing methods tackle this issue by developing deep graph learning models with differential privacy (DP). However, they often suffer from large noise injections and cannot provide structural preferences consistent with mining objectives. Recently, skip-gram based graph embedding generation techniques have become widely used due to their ability to extract customizable structures. Based on skip-gram, we present SE-PrivGEmb, a structure-preference enabled graph embedding generation method under DP. For arbitrary structure preferences, we design a unified noise tolerance mechanism that perturbs only non-zero vectors. This mechanism mitigates the utility degradation caused by high sensitivity. By carefully designing the negative sampling probabilities in skip-gram, we theoretically demonstrate that skip-gram can preserve arbitrary proximities, which quantify structural features in graphs. Extensive experiments show that our method outperforms existing state-of-the-art methods on structural equivalence and link prediction tasks.

Paper Structure (28 sections, 5 theorems, 21 equations, 4 figures, 6 tables, 2 algorithms)

Introduction
Preliminaries
Graph Embedding Generation
Differential Privacy
DPSGD
Node Proximity
Problem Description and A First-cut Solution
Problem Description
A First-cut Solution
Our Proposal: SE-PrivGEmb
Noise Tolerance via Perturbing Non-zero Vectors
Theoretical Guarantee on Structure Preference
Training Algorithm
Privacy and Complexity Analysis
Privacy Analysis
...and 13 more sections

Key Result

Theorem 1

If $\mathcal{A}$ is an $(\alpha, \epsilon)$-RDP algorithm, then it also satisfies $(\epsilon+\frac{\log (1 / \delta)}{\alpha-1}, \delta)$-DP for any $\delta \in(0,1)$.
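The conversion in Theorem 1 can be sketched as a small helper. This is a minimal illustration, not the authors' code: the function names `rdp_to_dp` and `best_dp_epsilon` and the example RDP values are assumptions made here. Taking the minimum over several orders $\alpha$ is a common accounting practice and is valid only if the algorithm satisfies RDP at each listed order.

```python
import math


def rdp_to_dp(alpha: float, rdp_eps: float, delta: float) -> float:
    """Apply Theorem 1: (alpha, rdp_eps)-RDP implies (eps, delta)-DP.

    Returns eps = rdp_eps + log(1/delta) / (alpha - 1).
    """
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    return rdp_eps + math.log(1.0 / delta) / (alpha - 1.0)


def best_dp_epsilon(rdp_curve: dict, delta: float) -> tuple:
    """Hypothetical helper: pick the order alpha giving the smallest eps.

    rdp_curve maps each order alpha to the RDP epsilon that holds at that
    order (illustrative input, not values from the paper).
    """
    return min((rdp_to_dp(a, e, delta), a) for a, e in rdp_curve.items())


# Illustrative RDP values at three orders (assumed, not from the paper).
curve = {2.0: 0.10, 8.0: 0.40, 32.0: 1.60}
eps, alpha = best_dp_epsilon(curve, delta=1e-5)
```

Small orders pay a large $\log(1/\delta)/(\alpha-1)$ penalty, while large orders usually carry a larger RDP epsilon, so the best order tends to sit in between.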

Figures (4)

Figure 1: Architecture of a Skip-gram Model. The embedding matrices $\mathbf{W}_{in}$, with a size of $|V|\times{r}$, and $\mathbf{W}_{out}^\top$, with a size of $r\times|V|$, are two model parameters that require optimization. For any node pair $(v_i, v_j)$, $\mathbf{v}_i$ represents the vector for $v_i$ in the input weight matrix $\mathbf{W}_{in}$, while $\mathbf{v}_j$ represents the vector for $v_j$ in the output weight matrix $\mathbf{W}_{out}$.
Figure 2: An Illustration of Private Update for $\mathbf{W}_{in}$. (a) represents the original input weight, (b) represents the gradient of the input weight, while (c) and (d) depict the perturbed input weight using Eq. (\ref{eq:novNoiseGra_on_vi}) and Eq. (\ref{eq:nonZeroNoiseGra_on_vi}), respectively.
Figure 3: Impact of Privacy Budget on Structural Equivalence
Figure 4: Impact of Privacy Budget on Link Prediction
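The contrast that Figure 2 draws between perturbing the whole gradient of $\mathbf{W}_{in}$ and perturbing only its non-zero vectors can be sketched as follows. This is a rough illustration under stated assumptions, not a reproduction of the paper's equations: the function names, the treatment of each matrix row as one node vector, the Gaussian noise scale, and the learning rate are all hypothetical, and gradient clipping and privacy accounting are omitted.

```python
import random


def perturb_all_rows(grad, noise_std, rng):
    """Naive variant: add Gaussian noise to every entry of the gradient."""
    return [[g + rng.gauss(0.0, noise_std) for g in row] for row in grad]


def perturb_nonzero_rows(grad, noise_std, rng):
    """Noise-tolerant variant: add noise only to rows that are not all zero."""
    noisy = []
    for row in grad:
        if any(g != 0.0 for g in row):
            noisy.append([g + rng.gauss(0.0, noise_std) for g in row])
        else:
            noisy.append(list(row))  # untouched zero row stays zero
    return noisy


def sgd_step(weights, grad, lr):
    """Plain gradient step: W <- W - lr * grad."""
    return [[w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(weights, grad)]


rng = random.Random(0)
# Toy input weights for |V| = 3 nodes and r = 2 dimensions (assumed values).
w_in = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
# For one training pair, only the row of the input node has a gradient.
grad = [[0.0, 0.0], [0.05, -0.02], [0.0, 0.0]]

naive = sgd_step(w_in, perturb_all_rows(grad, 1.0, rng), lr=0.1)
sparse = sgd_step(w_in, perturb_nonzero_rows(grad, 1.0, rng), lr=0.1)
```

In the naive update every node vector drifts because of noise, while in the second update only the vector that actually received a gradient changes.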

Theorems & Definitions (13)

Definition 1: Edge (Node)-Level DP [hay2009accurate]
Definition 2: RDP [mironov2017renyi]
Theorem 1: RDP conversion to $(\epsilon, \delta)$-DP [mironov2017renyi]
Definition 3: Sensitivity [dwork2006calibrating]
Definition 4: Node Proximity Matrix
Definition 5: Private Graph Embedding Generation under Bounded DP
Theorem 2
Theorem 3
proof
Definition 6: Subsample [wang2019subsampled]
...and 3 more
