Differentially Private Graph Learning via Sensitivity-Bounded Personalized PageRank

Alessandro Epasto, Vahab Mirrokni, Bryan Perozzi, Anton Tsitsulin, Peilin Zhong

TL;DR

This work addresses the privacy risks of computing Personalized PageRank (PPR) on graphs by introducing a sensitivity-bounded PPR algorithm that supports edge-level and joint edge-level differential privacy (DP). The core idea is to cap flow pushes (PushFlowCap) so that the sensitivity of the output is bounded. This enables private PPR whose additive error becomes small when the minimum degree of the graph is large, and the guarantees carry over to private graph embeddings and downstream tasks. The authors prove near-optimal sensitivity bounds and give DP methods for PPR-based rankings and embeddings, with experiments on real datasets showing competitive private performance.

Abstract

Personalized PageRank (PPR) is a fundamental tool in unsupervised learning of graph representations such as node ranking, labeling, and graph embedding. However, while data privacy is one of the most important recent concerns, existing PPR algorithms are not designed to protect user privacy. PPR is highly sensitive to the input graph edges: the difference of only one edge may cause a big change in the PPR vector, potentially leaking private user data. In this work, we propose an algorithm which outputs an approximate PPR and has provably bounded sensitivity to input edges. In addition, we prove that our algorithm achieves similar accuracy to non-private algorithms when the input graph has large degrees. Our sensitivity-bounded PPR directly implies private algorithms for several tools of graph learning, such as differentially private (DP) PPR ranking, DP node classification, and DP node embedding. To complement our theoretical analysis, we also empirically verify the practical performance of our algorithms.
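
The claim that a single edge can change the PPR vector substantially can be illustrated with a small sketch. The code below is not the paper's algorithm: it uses plain power iteration, and the teleport probability `alpha`, the toy four-node graph, and the helper names are all assumptions made for illustration. It compares the PPR vector of a low-degree source before and after one edge is added.

```python
def personalized_pagerank(adj, source, alpha=0.15, iters=200):
    """Power iteration for p = alpha * e_source + (1 - alpha) * p * D^{-1} A.

    `adj` maps each node to the set of its neighbours (undirected graph).
    `alpha` is the teleport probability (an assumed convention).
    """
    p = {v: 0.0 for v in adj}
    p[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in adj}
        nxt[source] += alpha  # teleport mass always returns to the source
        for u, mass in p.items():
            if not adj[u]:
                # Isolated node: send its walking mass back to the source.
                nxt[source] += (1 - alpha) * mass
                continue
            share = (1 - alpha) * mass / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        p = nxt
    return p


def with_edge(adj, u, v):
    """Return a copy of `adj` with the undirected edge {u, v} added."""
    new = {k: set(nbrs) for k, nbrs in adj.items()}
    new[u].add(v)
    new[v].add(u)
    return new


def l1_distance(p, q):
    """L1 distance between two PPR vectors over the same node set."""
    return sum(abs(p[v] - q[v]) for v in p)


# Toy graph: two disjoint edges {0, 1} and {2, 3}; the source has degree 1.
graph = {0: {1}, 1: {0}, 2: {3}, 3: {2}}
neighbor_graph = with_edge(graph, 0, 2)  # differs from `graph` by one edge

before = personalized_pagerank(graph, source=0)
after = personalized_pagerank(neighbor_graph, source=0)
gap = l1_distance(before, after)
print("L1 change in PPR after adding one edge:", gap)
```

Because the source has only one neighbour, the new edge redirects about half of its outgoing walk mass, so a large share of the PPR vector moves. On graphs where every node has many neighbours, a single edge redirects a much smaller fraction, which is in line with the abstract's statement about large-degree inputs.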

Paper Structure

This paper contains 32 sections, 16 theorems, 9 equations, 2 figures, 1 table, 8 algorithms.

Key Result

Theorem 2.6

Consider a function $f$ whose input is a graph and whose output is in $\mathbb{R}^k$. Suppose $f$ has sensitivity $S_f$. Then the algorithm $\mathcal{A}(G)$ which outputs $f(G)+(Y_1,Y_2,\cdots,Y_k)$ is $\varepsilon$-DP where $Y_i$ are independent $\mathrm{Lap}(S_f/\varepsilon)$ random variables. Sim
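
A minimal sketch of the Laplace mechanism stated above, using only the Python standard library. The function name `laplace_mechanism` and its parameters are hypothetical, and the sketch assumes that the sensitivity $S_f$ is measured in the $\ell_1$ norm, as in the standard formulation of this mechanism. Each Laplace sample is drawn as the difference of two independent exponential variables, which has the Laplace distribution with the same scale.

```python
import random


def laplace_mechanism(values, sensitivity, epsilon, rng=None):
    """Return values + (Y_1, ..., Y_k) with Y_i ~ Lap(sensitivity / epsilon).

    `values` is the exact output f(G) as a sequence of floats.
    `sensitivity` is S_f (assumed to be the L1 sensitivity of f).
    `epsilon` is the privacy parameter and must be positive.
    """
    if sensitivity < 0 or epsilon <= 0:
        raise ValueError("need sensitivity >= 0 and epsilon > 0")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    if scale == 0:
        return list(values)  # f is constant across neighbours: no noise needed
    rate = 1.0 / scale
    # The difference of two i.i.d. exponentials with mean `scale` is Lap(scale).
    return [v + rng.expovariate(rate) - rng.expovariate(rate) for v in values]


# Usage: privatize a 3-dimensional output with S_f = 2 and epsilon = 0.5.
noisy = laplace_mechanism([0.4, 0.35, 0.25], sensitivity=2.0, epsilon=0.5,
                          rng=random.Random(0))
print(noisy)
```

The noise scale `sensitivity / epsilon` is the only place the function $f$ enters the mechanism, which is why a smaller sensitivity bound translates directly into less noise at the same privacy level.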

Figures (2)

  • Figure 1: PPR approximation on two datasets.
  • Figure 2: Private embeddings introduced in this paper outperform other competitors and achieve much higher performance on a tight privacy budget.

Theorems & Definitions (38)

  • Definition 2.2: $\xi$-approximate PPR (e.g., [andersen2007using])
  • Definition 2.3: $(\xi,\eta)$-approximate PPR
  • Definition 2.4: Edge-level DP [eliavs2020differentially] and joint edge-level DP [kearns2014mechanism]
  • Definition 2.5: Sensitivity [dwork2006calibrating]
  • Theorem 2.6: Laplace mechanism [dwork2006calibrating]
  • Theorem 2.7: Composition [dwork2014algorithmic]
  • Lemma 3.1
  • Theorem 3.2: Sensitivity of PushFlow
  • proof
  • Lemma 4.2
  • ...and 28 more