Differentially Private Decentralized Learning with Random Walks

Edwige Cyffers, Aurélien Bellet, Jalaj Upadhyay

TL;DR

This work addresses privacy leakage in fully decentralized learning where model updates propagate via random walks on a graph. It introduces Pairwise Network Differential Privacy (PNDP) to capture the privacy loss between each pair of nodes and proposes a private random-walk SGD (RW-DP-SGD) that adds Gaussian noise at each local update. A central result is a closed-form PNDP bound for arbitrary graphs, showing that the privacy loss between nodes $u$ and $v$ after $T$ iterations depends on the graph topology through a matrix-log term, linking leakage to graph communicability. Empirical results on synthetic and real networks demonstrate that random-walk-based updates can achieve better privacy-utility trade-offs than gossip algorithms, especially for closely connected nodes. The analysis provides interpretable, topology-driven guarantees that bridge local and central privacy models.

Abstract

The popularity of federated learning comes from the possibility of better scalability and the ability for participants to keep control of their data, improving data security and sovereignty. Unfortunately, sharing model updates also creates a new privacy attack surface. In this work, we characterize the privacy guarantees of decentralized learning with random walk algorithms, where a model is updated by traveling from one node to another along the edges of a communication graph. Using a recent variant of differential privacy tailored to the study of decentralized algorithms, namely Pairwise Network Differential Privacy, we derive closed-form expressions for the privacy loss between each pair of nodes, where the impact of the communication topology is captured by graph-theoretic quantities. Our results further reveal that random walk algorithms tend to yield better privacy guarantees than gossip algorithms for nodes close to each other. We supplement our theoretical results with empirical evaluation on synthetic and real-world graphs and datasets.
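
To make the setting concrete, here is a minimal sketch of a random-walk SGD loop in which a single model (the token) hops along graph edges and each visited node applies one noisy local gradient step. It is an illustration under stated assumptions, not the paper's algorithm: the function name `rw_dp_sgd`, the adjacency-dict format, the gradient clipping used to bound each contribution, the uniform choice of the next neighbor, the random starting node, and the toy quadratic losses are all hypothetical choices made for this example.

```python
import random

def rw_dp_sgd(neighbors, local_grad, dim, steps, lr, sigma, clip=1.0, seed=0):
    """Random-walk SGD sketch: one token (the model) travels along graph edges.

    neighbors  : dict node -> list of adjacent nodes (assumed format)
    local_grad : callable (node, model) -> gradient as a list of floats (assumed)
    sigma      : standard deviation of the Gaussian noise added per coordinate
    clip       : gradient norm bound (an assumption, used to bound sensitivity)
    """
    rng = random.Random(seed)
    model = [0.0] * dim
    node = rng.choice(sorted(neighbors))  # assumed: start at a random node
    visited = [node]
    for _ in range(steps):
        g = local_grad(node, model)
        norm = sum(x * x for x in g) ** 0.5
        scale = min(1.0, clip / norm) if norm > 0 else 1.0
        # Local update: clipped gradient plus Gaussian noise on each coordinate.
        model = [w - lr * (scale * gi + rng.gauss(0.0, sigma))
                 for w, gi in zip(model, g)]
        # Pass the token to a uniformly chosen neighbor (assumed walk rule).
        node = rng.choice(neighbors[node])
        visited.append(node)
    return model, visited

# Toy usage: ring of 8 nodes, node v holds the loss 0.5 * (w - v)^2.
ring = {v: [(v - 1) % 8, (v + 1) % 8] for v in range(8)}
targets = [float(v) for v in range(8)]
grad = lambda v, w: [w[0] - targets[v]]

model, path = rw_dp_sgd(ring, grad, dim=1, steps=200, lr=0.1, sigma=0.5)
noiseless, _ = rw_dp_sgd(ring, grad, dim=1, steps=200, lr=0.1, sigma=0.0)
```

In this toy setup each node only ever sees the token when the walk visits it, which is the kind of partial view that a pairwise privacy analysis is meant to account for. The sketch does not compute any privacy bound.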

Paper Structure

This paper contains 24 sections, 10 theorems, 64 equations, 8 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Let $T^{1}, \dots, T^{K}, T'^{1}, \dots, T'^{K}$ be non-expansive operators, $x^{0}\in U$ an initial random state, and $(\zeta^{k})_{k=1}^K$ a sequence of noise distributions. Consider the noisy iterations $x^{k+1}=T^{k+1}(x^k)+\eta^{k+1}$ and $\bar{x}^{k+1}=T^{k+1}(\bar{x}^k)+\bar{\eta}^{k+1}$. Then, where $*$ is the convolution of probability distributions and $\mathbf{x}$ denotes the distr

Figures (8)

  • Figure 1: Comparison of privacy loss for random walks in bold lines and gossip in dashed lines for the same synthetic graphs with $n=2048$. Random walks allow privacy amplification even for very close nodes, but the decay is slower than for gossip.
  • Figure 2: Private logistic regression on the Houses dataset, comparing our RW DP-SGD with Local and Centralized DP-SGD as baselines.
  • Figure 3: Link between graph structure and privacy loss. Left: (a) example of Facebook Ego graph communicability and privacy loss, logarithmic scale. Middle: (b) same on the Southern women graph. Right: (c) the corresponding mean privacy loss and Katz centrality.
  • Figure 4: Comparison of privacy loss for random walks when nodes know who sent them the token (bold lines) and gossip (dashed lines) for the same synthetic graphs with $n=2048$. Privacy amplification is visible even for close neighbors, but the decay is slower than for gossip.
  • Figure 5: Stochastic Block Model with $200$ nodes in three clusters $(75,75, 50)$ and probability matrix $[[0.25, 0.05, 0.02], [0.05, 0.35, 0.07], [0.02, 0.07, 0.40]]$. The privacy loss matrix recovers the different blocks.
  • ...and 3 more figures
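
The graph described in the Figure 5 caption can be sampled with a few lines of standard-library Python. The sketch below draws a Stochastic Block Model with the block sizes and probability matrix stated in the caption and reports empirical edge densities per block pair. The helper names `sample_sbm` and `block_density`, the seed, and the adjacency-set representation are assumptions of this example. It only reproduces the graph model and does not compute the privacy loss matrix shown in the figure.

```python
import random

def sample_sbm(sizes, probs, seed=0):
    """Sample an undirected Stochastic Block Model graph (no self-loops).

    sizes : list of block sizes
    probs : symmetric matrix, probs[a][b] = edge probability between blocks a, b
    Returns (adjacency dict node -> set of neighbors, block label per node).
    """
    rng = random.Random(seed)
    block = [b for b, size in enumerate(sizes) for _ in range(size)]
    n = len(block)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < probs[block[u]][block[v]]:
                adj[u].add(v)
                adj[v].add(u)
    return adj, block

def block_density(adj, block, a, b):
    """Fraction of possible edges present between blocks a and b."""
    nodes_a = [v for v in adj if block[v] == a]
    nodes_b = [v for v in adj if block[v] == b]
    if a == b:
        pairs = len(nodes_a) * (len(nodes_a) - 1) // 2
        edges = sum(1 for u in nodes_a for v in adj[u]
                    if v > u and block[v] == a)
    else:
        pairs = len(nodes_a) * len(nodes_b)
        edges = sum(1 for u in nodes_a for v in adj[u] if block[v] == b)
    return edges / pairs

# Parameters taken from the Figure 5 caption.
sizes = [75, 75, 50]
probs = [[0.25, 0.05, 0.02], [0.05, 0.35, 0.07], [0.02, 0.07, 0.40]]
adj, block = sample_sbm(sizes, probs)

for a in range(3):
    print([round(block_density(adj, block, a, b), 3) for b in range(3)])
```

The printed density matrix should be close to the probability matrix, with the diagonal entries clearly larger than the off-diagonal ones. This is the block structure that, according to the caption, the privacy loss matrix recovers.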

Theorems & Definitions (29)

  • Definition 1: Irreducibility
  • Definition 2: Aperiodicity
  • Definition 3: Stationary distribution
  • Definition 4: Transition matrix
  • Definition 5: Mixing time
  • Remark 1
  • Definition 6: Rényi Differential Privacy
  • Theorem 1: Privacy amplification by iteration
  • Definition 7: Pairwise Network DP
  • Theorem 2
  • ...and 19 more