Privacy Attacks in Decentralized Learning

Abdellah El Mrini, Edwige Cyffers, Aurélien Bellet

TL;DR

This paper shows that privacy is not guaranteed in decentralized learning via gossip protocols: honest-but-curious attackers can reconstruct private data of non-neighboring nodes by exploiting the linear relationships in exchanged messages. The authors develop a reconstruction framework that builds a knowledge matrix $K_T$ from observed communications and solves a linear system $K_T X = Y_T$ to recover private inputs, with an extension to Decentralized Gradient Descent (D-GD) that reconstructs gradients first and then data via gradient-inversion as a black box. They validate the attacks on synthetic and real graphs, demonstrating substantial leakage even from a single attacker and stronger leakage with multiple attackers; graph topology, attacker position, and learning rate strongly influence success. The work argues that decentralization alone is insufficient for privacy and emphasizes the need for defenses such as differential privacy, secure aggregation, or graph-design strategies to mitigate leakage. It also sets up a foundation for auditing the privacy risk of a given gossip matrix and network structure, guiding safer deployment of decentralized learning systems.
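The linear-system view described above can be illustrated on a toy case. The sketch below is a simplified assumption rather than the paper's exact construction: it supposes synchronous gossip averaging $x_{t+1} = W x_t$ on a path graph, with a single attacker that knows its own initial value and sees each neighbor's current value at every round. The helper names (`path_gossip_matrix`, `knowledge_matrix`, `reconstructible_nodes`) and the weight choice $W = I - L/3$ are hypothetical.

```python
import numpy as np


def path_gossip_matrix(n):
    """Symmetric, doubly stochastic gossip matrix W = I - L/3 on a path graph (assumed weights)."""
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    L = np.diag(A.sum(axis=1)) - A  # graph Laplacian
    return np.eye(n) - L / 3.0


def knowledge_matrix(W, attacker, neighbors, T):
    """Stack one row per attacker observation, each a linear form in the initial values x_0.

    Row e_attacker: the attacker knows its own input.
    Row e_j^T W^t: value held by neighbor j at round t, for t = 0..T-1.
    """
    n = W.shape[0]
    rows = [np.eye(n)[attacker]]
    Wt = np.eye(n)  # W^0
    for _ in range(T):
        for j in neighbors:
            rows.append(Wt[j])
        Wt = Wt @ W
    return np.array(rows)


def reconstructible_nodes(K, tol=1e-8):
    """Node i is reconstructible when e_i lies in the row space of K."""
    P = np.linalg.pinv(K, 1e-10) @ K  # orthogonal projector onto the row space
    return [i for i in range(K.shape[1]) if abs(P[i, i] - 1.0) < tol]


if __name__ == "__main__":
    W = path_gossip_matrix(5)
    x0 = np.array([3.0, -1.0, 4.0, 1.0, 5.0])  # private inputs (made up)
    for T in (2, 4):
        K = knowledge_matrix(W, attacker=0, neighbors=[1], T=T)
        y = K @ x0                               # what the attacker observes
        x_hat = np.linalg.pinv(K, 1e-10) @ y     # minimum-norm solution of K x = y
        print(T, reconstructible_nodes(K), np.round(x_hat, 6))
```

In this toy setting, each extra round extends the support of the observed rows by one hop along the path, which is why more iterations expose nodes farther from the attacker. Entries of `x_hat` are meaningful only for nodes reported as reconstructible.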

Abstract

Decentralized Gradient Descent (D-GD) allows a set of users to perform collaborative learning without sharing their data by iteratively averaging local model updates with their neighbors in a network graph. The absence of direct communication between non-neighbor nodes might lead to the belief that users cannot infer precise information about the data of others. In this work, we demonstrate the opposite, by proposing the first attack against D-GD that enables a user (or set of users) to reconstruct the private data of other users outside their immediate neighborhood. Our approach is based on a reconstruction attack against the gossip averaging protocol, which we then extend to handle the additional challenges raised by D-GD. We validate the effectiveness of our attack on real graphs and datasets, showing that the number of users compromised by a single or a handful of attackers is often surprisingly large. We empirically investigate some of the factors that affect the performance of the attack, namely the graph topology, the number of attackers, and their position in the graph.
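To make the "local update, then average with neighbors" pattern concrete, here is a minimal sketch of one common D-GD formulation. It is an assumption and may differ from the paper's own update equations (for instance in the order of the gradient and averaging steps). Each node $i$ holds a scalar model $\theta_i$ and a private value $x_i$ with local loss $\tfrac{1}{2}(\theta_i - x_i)^2$; the ring topology, weights, and names (`ring_gossip_matrix`, `dgd`) are hypothetical.

```python
import numpy as np


def ring_gossip_matrix(n):
    """Doubly stochastic weights on a ring: 1/2 on self, 1/4 on each of the two neighbors."""
    I = np.eye(n)
    return 0.5 * I + 0.25 * (np.roll(I, 1, axis=0) + np.roll(I, -1, axis=0))


def dgd(W, x, lr=0.1, steps=500):
    """Run D-GD on local losses f_i(theta) = 0.5 * (theta - x_i)^2 (assumed toy objective)."""
    theta = np.zeros_like(x)
    for _ in range(steps):
        grad = theta - x               # local gradients, computed from private data
        theta = W @ (theta - lr * grad)  # local step, then averaging with neighbors
    return theta


if __name__ == "__main__":
    x = np.array([1.0, 2.0, 3.0, 6.0])  # private values (made up)
    theta = dgd(ring_gossip_matrix(4), x)
    print(theta, theta.mean(), x.mean())
```

Only the averaged models travel over the network, yet every message is a function of the private gradients, which is the dependence the attack exploits.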

Paper Structure (33 sections, 1 theorem, 9 equations, 12 figures, 3 tables, 2 algorithms)

Key Result

Proposition 5.3

For the D-GD algorithm described by eq:gd1 and eq:gd2, we have:

Figures (12)

  • Figure 1: Overview of our attack on gossip averaging. The attackers $0$ and $1$ (red) receive updates from nodes $2$, $5$, $7$ and $8$ (orange). For $T=3$ iterations, this yields the knowledge matrix $K_3$. Its reduced row echelon form (matrix $U$) shows that only nodes $3$ and $4$ are non-reconstructible (green). All other nodes (purple) have their private value leaked.
  • Figure 2: Average fraction of reconstructed nodes in Erdős-Rényi graphs for different numbers of nodes $n$ and edge probabilities $p$, with $1$, $2$, or $3$ attacker nodes. Error bars give the standard deviations, computed over $20$ random graphs.
  • Figure 3: Reconstruction attack on the Facebook Ego Graph $414$. Top: each node is colored by the number of nodes it can reconstruct among the $147$ other nodes. Bottom: detailed view of the case where the node circled in red is the attacker, with reconstructed nodes shown in purple and non-reconstructed ones in yellow.
  • Figure 4: Reconstruction attack on D-GD for a line graph with $31$ nodes where the attacker lies at an extremity. The first (resp. second) row shows the true (resp. reconstructed) inputs of the 30 other nodes ordered by their distance to the attacker.
  • Figure 5: Reconstruction attacks on D-GD for the Florentine graph (CIFAR-10, logistic regression model, learning rate $10^{-5}$). Left: the color of each node represents the success rate when that node is the attacker. The success rate is measured as the fraction of nodes for which the reconstructed image achieves a PSNR above $10$ with respect to the original image (averaged over 10 experiments). Right: detailed view of the case where the attacker is node 5 (highlighted with blue borders). Nodes with green borders are accurately reconstructed; those with red borders are not. For completeness, the true input images are shown in the paper's appendix.
  • ...and 7 more figures
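
The success criterion in the Figure 5 caption (fraction of nodes whose reconstruction exceeds a PSNR of $10$) can be sketched as follows. This uses the standard PSNR definition, $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$, and assumes pixel values scaled to $[0, 1]$ and a threshold expressed in dB; the caption does not state either, and the function names are hypothetical.

```python
import numpy as np


def psnr(original, reconstructed, max_value=1.0):
    """Peak signal-to-noise ratio in dB; max_value is the assumed pixel range."""
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((original - reconstructed) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


def success_rate(originals, reconstructions, threshold=10.0):
    """Fraction of (original, reconstruction) pairs whose PSNR exceeds the threshold."""
    scores = [psnr(o, r) for o, r in zip(originals, reconstructions)]
    return float(np.mean([s > threshold for s in scores]))
```

Under these assumptions, a PSNR of $10$ dB corresponds to a mean squared error of $0.1$, so the threshold marks a fairly coarse match between the reconstruction and the original.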

Theorems & Definitions (12)

  • Remark 3.1
  • Definition 3.2: Gossip matrix
  • Remark 3.3: Accelerated gossip
  • Definition 4.1: Knowledge matrix and observation vector
  • Definition 4.2: Reconstructible node
  • Remark 4.3: Secure aggregation
  • Remark 4.4: Extension to dynamic networks
  • Remark 5.2: Connection to differential privacy
  • Proposition 5.3: Closed-form of D-GD updates
  • proof
  • ...and 2 more