Exposing the Vulnerability of Decentralized Learning to Membership Inference Attacks Through the Lens of Graph Mixing
Ousmane Touat, Jezekael Brunon, Yacine Belal, Julien Nicolas, César Sabater, Mohamed Maouche, Sonia Ben Mokhtar
TL;DR
This work analyzes how Membership Inference Attacks (MIA) threaten decentralized, gossip-based learning and identifies two mixing-related factors as core determinants of MIA vulnerability: the local model mixing strategy and the global mixing properties of the communication graph. It introduces SAMO, a Send-All-Merge-Once protocol, and demonstrates that dynamic, random peer sampling combined with SAMO markedly improves the privacy-utility tradeoff across multiple datasets, particularly when paired with differential privacy techniques. The study provides both empirical and theoretical insights, showing that dynamic topologies accelerate graph mixing (lowering the second-largest eigenvalue of the mixing matrix) and reduce leakage, while non-i.i.d. data and early overfitting still pose significant privacy risks. The findings offer practical design guidance for privacy-aware decentralized systems: use dynamic topologies, strengthen mixing, and handle data heterogeneity carefully to make collaborative learning in distributed environments safer.
Abstract
The primary promise of decentralized learning is to allow users to collaboratively train machine learning models while keeping their data on their premises and without relying on any central entity. However, this paradigm requires peers to exchange model parameters or gradients. Such exchanges can be exploited by privacy attacks, such as Membership Inference Attacks (MIA), to infer sensitive information about the training data. To devise effective defense mechanisms, it is important to understand the factors that increase or reduce the vulnerability of a given decentralized learning architecture to MIA. In this study, we extensively explore the vulnerability to MIA of various decentralized learning architectures by varying the graph structure (e.g., number of neighbors), the graph dynamics, and the aggregation strategy, across diverse datasets and data distributions. Our key finding, which to the best of our knowledge we are the first to report, is that the vulnerability to MIA is heavily correlated with (i) the local model mixing strategy performed by each node upon reception of models from neighboring nodes and (ii) the global mixing properties of the communication graph. We illustrate these results experimentally using four datasets and by theoretically analyzing the mixing properties of various decentralized architectures. We also empirically show that enhancing mixing properties is highly beneficial when combined with other privacy-preserving techniques such as Differential Privacy. Our paper draws a set of lessons learned for devising decentralized learning systems that are, by design, less vulnerable to MIA.
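
To make the notion of "global mixing properties of the communication graph" concrete, the following is a minimal sketch, not the paper's protocol or experimental setup. It assumes a toy gossip scheme in which every node averages its value with its two ring neighbors, and it models a dynamic topology simply by relabeling the ring at random in each round. All function names (`ring_mixing_matrix`, `second_largest_eigenvalue_modulus`, `relabel`, `consensus_error`) and the parameters `n = 32` and `rounds = 20` are hypothetical choices made for illustration. The sketch compares how far the product of mixing matrices is from exact averaging after a fixed number of rounds, for a static and a dynamic topology.

```python
import numpy as np


def ring_mixing_matrix(n):
    """Doubly stochastic matrix: each node averages itself and its two ring neighbors."""
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i, j % n] = 1.0 / 3.0
    return W


def second_largest_eigenvalue_modulus(W):
    """Second-largest eigenvalue modulus of a symmetric mixing matrix."""
    # eigvalsh assumes W is symmetric, which holds for the matrices built here.
    moduli = np.sort(np.abs(np.linalg.eigvalsh(W)))[::-1]
    return moduli[1]


def relabel(W, rng):
    """Same graph with randomly permuted node labels (illustrative dynamic topology)."""
    perm = rng.permutation(W.shape[0])
    return W[np.ix_(perm, perm)]


def consensus_error(matrices):
    """Spectral norm of (product of the mixing matrices) minus the exact-averaging matrix."""
    n = matrices[0].shape[0]
    product = np.eye(n)
    for W in matrices:
        product = W @ product
    return np.linalg.norm(product - np.full((n, n), 1.0 / n), 2)


n, rounds = 32, 20  # illustrative values, not taken from the paper
rng = np.random.default_rng(0)
ring = ring_mixing_matrix(n)

static_err = consensus_error([ring] * rounds)
dynamic_err = consensus_error([relabel(ring, rng) for _ in range(rounds)])

print("second-largest eigenvalue modulus:", second_largest_eigenvalue_modulus(ring))
print("static topology, distance from averaging:", static_err)
print("dynamic topology, distance from averaging:", dynamic_err)
```

For a fixed symmetric mixing matrix, the distance from exact averaging after `rounds` steps equals the second-largest eigenvalue modulus raised to the power `rounds`, so a smaller eigenvalue means faster mixing. In this toy setting, changing the graph between rounds cannot make the distance larger than in the static case, and in practice it shrinks much faster. How this quantity relates to MIA vulnerability is the subject of the paper's own analysis and is not something this sketch measures.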
