Straggler-Resilient Differentially-Private Decentralized Learning

Yauhen Yakimenka, Chung-Wei Weng, Hsuan-Yin Lin, Eirik Rosnes, Jörg Kliewer

TL;DR

This work tackles the straggler problem in fully decentralized learning on a ring while preserving user privacy by extending network differential privacy (NDP) to account for total training latency. It introduces a skipping scheme and analyzes both a ring and a randomized ring, deriving convergence guarantees that align with the baseline results of prior SGD analyses and DP bounds via Rényi DP with privacy amplification by iteration. A key finding is that the DP leakage $\varepsilon_{\mathrm{skip}}$ scales linearly with the total number of updates (and thus with $h_{\max}$ under certain conditions), while randomizing the ring improves privacy amplification without sacrificing asymptotic convergence. Empirical validation on OpenML housing data for logistic regression and on MNIST/CIFAR-10 demonstrates practical latency reductions and a quantifiable privacy-utility trade-off, illustrating how to tune the skip timeout to balance speed, accuracy, and privacy in decentralized learning systems.

Abstract

We consider the straggler problem in decentralized learning over a logical ring while preserving user data privacy. Specifically, we extend the recently proposed framework of differential privacy (DP) amplification by decentralization by Cyffers and Bellet to include overall training latency, comprising both computation and communication latency. Analytical results on both the convergence speed and the DP level are derived for a skipping scheme (which ignores stragglers after a timeout) and for a baseline scheme that waits for each node to finish before training continues. A trade-off between overall training latency, accuracy, and privacy, parameterized by the timeout of the skipping scheme, is identified and empirically validated for logistic regression on a real-world dataset and for image classification using the MNIST and CIFAR-10 datasets.
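
The difference between the baseline scheme and the skipping scheme can be illustrated with a toy simulation. The sketch below is not the paper's algorithm: the function name `simulate_ring`, the exponential latency model, and the choice to charge exactly the timeout for a skipped node are all assumptions made for illustration. It only shows how a timeout trades total latency against the number of model updates actually performed.

```python
import random


def simulate_ring(n, rounds, timeout=None, mean_latency=1.0, seed=0):
    """Pass a token around a logical ring of n nodes for a number of rounds.

    timeout=None models a baseline that always waits for the node;
    a finite timeout models skipping a node that is too slow.
    Returns (total_latency, number_of_updates).
    """
    rng = random.Random(seed)
    total_latency = 0.0
    updates = 0
    for _ in range(rounds):
        for _node in range(n):
            # Assumed latency model: exponential compute + communication time.
            latency = rng.expovariate(1.0 / mean_latency)
            if timeout is not None and latency > timeout:
                # Straggler: stop waiting after `timeout`, no update is applied.
                total_latency += timeout
            else:
                total_latency += latency
                updates += 1
    return total_latency, updates


if __name__ == "__main__":
    for t in (None, 2.0, 1.0, 0.5):
        lat, upd = simulate_ring(n=10, rounds=100, timeout=t, seed=1)
        print(f"timeout={t}: latency={lat:.1f}, updates={upd}")
```

Under these assumptions a smaller timeout lowers the total latency but also lowers the number of updates, which is the quantity that the convergence and privacy bounds depend on.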

Paper Structure

This paper contains 15 sections, 5 theorems, 2 equations, 4 figures, 1 algorithm.

Key Result

Theorem 1

Under the Lipschitz assumption, if the diameter of $\mathcal{W}$ is $d_{\mathcal{W}}$, the expected difference between the minimum value $f(\tau^*;\cdot)$ and that obtained by the skipping scheme algorithm with an arbitrary learning rate parameter $\zeta > 0$ after $h_{\max}$ steps is bounded as [...] where $\forall\, h > 0$, and $e_0 \triangleq d_{\mathcal{W}} k$, $|\lambda_1| = \frac{1-p}{\sqrt{(1+p^2) - 2p\cos(\frac{2\pi}{n})}}$ and $0 <$ [...]
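
The only fully stated quantity in this excerpt is $|\lambda_1|$, so a small numerical check of that expression may help. In the sketch below, the function name `lambda1_abs` is hypothetical, $n$ is taken to be the number of nodes (as in the figure captions), and $p$ is treated as a free parameter in $[0, 1)$ because its definition is not given here.

```python
import math


def lambda1_abs(p, n):
    """Evaluate |lambda_1| = (1 - p) / sqrt((1 + p^2) - 2 p cos(2 pi / n)).

    p: parameter assumed to lie in [0, 1); its meaning is not stated in the excerpt.
    n: assumed to be the number of nodes on the ring (n >= 2).
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p is assumed to lie in [0, 1)")
    if n < 2:
        raise ValueError("n is assumed to be at least 2")
    denom = math.sqrt((1.0 + p * p) - 2.0 * p * math.cos(2.0 * math.pi / n))
    return (1.0 - p) / denom


if __name__ == "__main__":
    for n in (10, 500):
        for p in (0.0, 0.1, 0.3, 0.5):
            print(f"n={n}, p={p}: |lambda_1| = {lambda1_abs(p, n):.6f}")
```

Since $\cos(2\pi/n) \le 1$, the denominator is at least $1 - p$, so the value never exceeds 1, and it moves toward 1 as $n$ grows.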

Figures (4)

  • Figure 1: Illustrating the $j$-th round in which node $v_i$ is a straggler.
  • Figure 2: Expected error bound (decreasing curves; convergence theorem) and privacy leakage level $\varepsilon_{\mathrm{skip}}$ (increasing curves; privacy theorems for the fixed ring and the randomized ring) vs. average latency (latency proposition) for $n=10$ (top row) and $n=500$ (bottom row). Solid lines are for a fixed ring (Skip-Ring), while dashed lines are for Skip-Rand-Ring.
  • Figure 3: Privacy leakage level $\varepsilon_{\mathrm{skip}}$ vs expected error bound for $n=10$ (top row) and $n=500$ (bottom row). Solid lines are for a fixed ring (Skip-Ring), while dashed lines are for Skip-Rand-Ring.
  • Figure :

Theorems & Definitions (10)

  • Definition 1: $k$-Lipschitz continuity
  • Definition 2: $\beta$-smoothness
  • Definition 3: NDP (Cyffers and Bellet)
  • Theorem 1
  • Remark 1
  • Theorem 2
  • Theorem 3
  • Remark 2
  • Lemma 1
  • Lemma 2