Noiseless Privacy-Preserving Decentralized Learning

Sayan Biswas, Mathieu Even, Anne-Marie Kermarrec, Laurent Massoulie, Rafael Pires, Rishi Sharma, Martijn de Vos

TL;DR

This work proves the convergence of Shatter, formally analyzes how Shatter reduces the efficacy of attacks compared to exchanging full models between nodes, and evaluates its convergence and attack resilience with existing DL algorithms, on heterogeneous datasets, and against three standard privacy attacks.

Abstract

Decentralized learning (DL) enables collaborative learning without a server and without training data leaving the users' devices. However, the models shared in DL can still be used to infer training data. Conventional defenses such as differential privacy and secure aggregation fall short in effectively safeguarding user privacy in DL, either sacrificing model utility or efficiency. We introduce Shatter, a novel DL approach in which nodes create virtual nodes (VNs) to disseminate chunks of their full model on their behalf. This enhances privacy by (i) preventing attackers from collecting full models from other nodes, and (ii) hiding the identity of the original node that produced a given model chunk. We theoretically prove the convergence of Shatter and provide a formal analysis demonstrating how Shatter reduces the efficacy of attacks compared to when exchanging full models between nodes. We evaluate the convergence and attack resilience of Shatter with existing DL algorithms, with heterogeneous datasets, and against three standard privacy attacks. Our evaluation shows that Shatter not only renders these privacy attacks infeasible when each node operates 16 VNs but also exhibits a positive impact on model utility compared to standard DL. In summary, Shatter enhances the privacy of DL while maintaining the utility and efficiency of the model.
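The chunk-and-forward idea described above can be illustrated with a minimal sketch of one communication round. This is not the authors' implementation: the contiguous chunking, the uniformly random choice of receiving VNs, the unweighted per-parameter averaging, and the names `split_into_chunks`, `shatter_round`, and `fanout` are all assumptions made for illustration, and the local training step is omitted.

```python
import random


def split_into_chunks(model, k):
    """Split a flat parameter list into k contiguous chunks of (index, value) pairs."""
    d = len(model)
    bounds = [round(s * d / k) for s in range(k + 1)]
    return [[(j, model[j]) for j in range(bounds[s], bounds[s + 1])]
            for s in range(k)]


def shatter_round(models, k, fanout, rng):
    """One round: every VN sends its chunk to `fanout` random VNs of other nodes.

    models: list of flat parameter lists, one per real node (RN).
    Returns the aggregated models (assumed uniform averaging, no training step).
    """
    n = len(models)
    # VN (i, s) is the s-th virtual node of RN i and carries chunk s of its model.
    vns = [(i, s) for i in range(n) for s in range(k)]
    chunks = {(i, s): chunk
              for i in range(n)
              for s, chunk in enumerate(split_into_chunks(models[i], k))}

    # inbox[i] collects the (index, value) pairs that RN i's VNs forward to it.
    inbox = [[] for _ in range(n)]
    for sender in vns:
        candidates = [v for v in vns if v[0] != sender[0]]
        for receiver in rng.sample(candidates, fanout):
            inbox[receiver[0]].extend(chunks[sender])

    # Each RN averages, per parameter, its own value with the values it received.
    new_models = []
    for i in range(n):
        sums = list(models[i])
        counts = [1] * len(models[i])
        for j, value in inbox[i]:
            sums[j] += value
            counts[j] += 1
        new_models.append([s / c for s, c in zip(sums, counts)])
    return new_models


# Hypothetical usage: 4 RNs, 8 parameters each, 2 VNs per RN.
toy_models = [[float(i)] * 8 for i in range(4)]
aggregated = shatter_round(toy_models, k=2, fanout=2, rng=random.Random(0))
```

In this sketch a receiving RN only ever sees partial parameter vectors, and the forwarded pairs carry no sender identity, which mirrors the two privacy properties listed in the abstract.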

Paper Structure (61 sections, 5 theorems, 9 equations, 13 figures, 1 table)

Key Result

Theorem 1

Assume that the functions $f_i$ are $L$-smooth, that $f$ is lower bounded and minimized at some $\theta^\star\in\mathbb{R}^d$, and that the stochastic gradients are unbiased (i.e., $\mathbb{E}\left[g_i^{(t,h)}\right] = \nabla f_i\left(\tilde{\theta}_i^{(t,h)}\right)$ for all $i,t,h$) with variance upper bounded […]. Finally, assume that $\rho<1$, where $\rho$ is defined in eq:rho, in the proof of lem:contraction. …
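
For reference, the smoothness and unbiasedness assumptions named in the theorem can be written out as below. The smoothness condition is the usual textbook definition; the choice of norm and the quantification over all of $\mathbb{R}^d$ are assumed here rather than quoted from the paper.

$$\left\|\nabla f_i(x) - \nabla f_i(y)\right\| \le L\,\left\|x - y\right\| \quad \forall\, x, y \in \mathbb{R}^d, \qquad \mathbb{E}\left[g_i^{(t,h)}\right] = \nabla f_i\left(\tilde{\theta}_i^{(t,h)}\right) \quad \forall\, i, t, h.$$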

Figures (13)

  • Figure 1: With standard DL (left), nodes continuously send their full model to other nodes in the communication topology. With Shatter, each real node (RN) operates multiple VNs (middle), and VNs communicate directly with other VNs. In addition, each VN sends only a part of its RN's full model to other VNs (right). This hides the identity of the original node that produced a given model chunk.
  • Figure 2: Selected reconstructed images (per row) using the GIA and the average LPIPS score ($\uparrow$ is more private) for all 1600 images processed in a round, when using standard DL with 100 clients (a), TopK sparsification and random model chunking (b).
  • Figure 3: The test accuracy ($\uparrow$ is better) on the left and MIA attack success ($\downarrow$ is more private) on the right for DL and Muffliato with three noise levels, using CIFAR-10 as training dataset.
  • Figure 4: The three-step workflow of Shatter, showing the operations during a single training round. An RN $N_i$ first splits its local model into $k$ chunks and sends each chunk to one of its VNs (left). We refer to the $s^{\text{th}}$ chunk of RN $N_i$ as $C_{i,s}$. VNs are connected in a communication topology and exchange model chunks with other VNs (middle). VNs forward received chunks to the corresponding RN, which aggregates them into its local model and performs a training step (right).
  • Figure 5: Shatter from the perspective of RN $N_i$
  • ...and 8 more figures

Theorems & Definitions (10)

  • Theorem 1
  • Theorem 2
  • Remark 1
  • Theorem 3
  • Remark 2
  • Theorem 4
  • Remark 3
  • Remark 4
  • Lemma 5
  • Definition F.2: Mutual information (Shannon, 1948)