Provable Privacy Advantages of Decentralized Federated Learning via Distributed Optimization

Wenrui Yu; Qiongxiu Li; Milan Lopuhaä-Zwakenberg; Mads Græsbøll Christensen; Richard Heusdens

TL;DR

This work analyzes privacy in centralized versus decentralized federated learning (CFL and DFL) through both information-theoretic bounds and empirical privacy attacks. It introduces mutual-information-based bounds showing that privacy leakage in DFL under distributed optimization is never larger than in CFL, and shows how noise in the auxiliary variables can further reduce leakage. The authors validate the theory with logistic regression and deep neural networks: DFL often carries lower privacy risk, especially for complex models and larger honest components, though simpler models may yield comparable leakage. Across experiments with gradient inversion and membership inference attacks, CFL generally leaks more private information than DFL, and the privacy gap narrows as more nodes become corrupt. The results point to practical privacy benefits of distributed optimization-based DFL, suggesting targeted deployments where decentralization improves privacy against iterative attacks while maintaining convergence performance.

Abstract

Federated learning (FL) emerged as a paradigm designed to improve data privacy by enabling data to reside at its source, thus embedding privacy as a core consideration in FL architectures, whether centralized or decentralized. Contrasting with recent findings by Pasquini et al., which suggest that decentralized FL does not empirically offer any additional privacy or security benefits over centralized models, our study provides compelling evidence to the contrary. We demonstrate that decentralized FL, when deploying distributed optimization, provides enhanced privacy protection - both theoretically and empirically - compared to centralized approaches. The challenge of quantifying privacy loss through iterative processes has traditionally constrained the theoretical exploration of FL protocols. We overcome this by conducting a pioneering in-depth information-theoretical privacy analysis for both frameworks. Our analysis, considering both eavesdropping and passive adversary models, successfully establishes bounds on privacy leakage. We show information theoretically that the privacy loss in decentralized FL is upper bounded by the loss in centralized FL. Compared to the centralized case where local gradients of individual participants are directly revealed, a key distinction of optimization-based decentralized FL is that the relevant information includes differences of local gradients over successive iterations and the aggregated sum of different nodes' gradients over the network. This information complicates the adversary's attempt to infer private data. To bridge our theoretical insights with practical applications, we present detailed case studies involving logistic regression and deep neural networks. These examples demonstrate that while privacy leakage remains comparable in simpler models, complex models like deep neural networks exhibit lower privacy risks under decentralized FL.
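The distinction drawn in the abstract, individual gradients versus gradient differences and network-wide sums, can be illustrated with a toy sketch. This is not the paper's protocol or its bound: the scalar integer "gradients", the function names `cfl_view`, `dfl_view`, and `demo`, and the shift constant `c` are all assumptions made for illustration. The sketch only shows that two different gradient histories can produce an identical differences-and-sum view while remaining distinguishable when each gradient is revealed directly.

```python
import random

def cfl_view(grads):
    """Toy CFL-style observation: every local gradient at every iteration.
    grads[t][i] is the (scalar, hypothetical) gradient of node i at iteration t."""
    return [list(row) for row in grads]

def dfl_view(grads):
    """Toy DFL-style observation: per-node gradient differences over successive
    iterations, plus the sum of all nodes' gradients at each iteration."""
    diffs = [
        [grads[t + 1][i] - grads[t][i] for i in range(len(grads[t]))]
        for t in range(len(grads) - 1)
    ]
    sums = [sum(row) for row in grads]
    return diffs, sums

def demo(seed=0, iters=4, nodes=3, c=3):
    """Build two gradient histories that differ by a constant shift:
    +c on node 0 and -c on node 1 at every iteration (assumed toy setup)."""
    rng = random.Random(seed)
    # Integers keep the arithmetic exact, so the views can be compared with ==.
    grads = [[rng.randint(-10, 10) for _ in range(nodes)] for _ in range(iters)]
    shifted = [[row[0] + c, row[1] - c] + row[2:] for row in grads]
    return grads, shifted

if __name__ == "__main__":
    original, shifted = demo()
    # The direct-gradient view tells the two histories apart ...
    print("CFL views differ:", cfl_view(original) != cfl_view(shifted))
    # ... while the differences-and-sum view is identical for both.
    print("DFL views equal: ", dfl_view(original) == dfl_view(shifted))
```

Because `dfl_view` is computed entirely from what `cfl_view` exposes, the toy DFL observation cannot carry more information than the toy CFL one. That loosely mirrors the direction of the upper bound stated in the abstract, though the paper's actual analysis concerns the specific variables exchanged by the distributed optimizer.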

Paper Structure (46 sections, 4 theorems, 42 equations, 14 figures, 1 algorithm)


Introduction
Paper contribution
Outline and Notation
Preliminaries
Centralized FL
Decentralized FL
Average consensus-based approaches
Distributed optimization-based approaches
Distributed optimizers
Threat models
Privacy evaluation
Information-theoretical privacy metric
Fundamentals of mutual information
Empirical evaluation via privacy attacks
Gradient inversion attack
...and 31 more sections

Key Result

Proposition 1

Let $\mathcal{G}_h=({\mathcal{V}}_h,\mathcal{E}_h)$ be the subgraph of $\mathcal{G}$ after eliminating all corrupt nodes. Let $\mathcal{G}_{ h,1},\ldots,\mathcal{G}_{ h,k_h}$ denote the components of $\mathcal{G}_h$ and let ${\mathcal{V}}_{ h,k}$ be the vertex set of $\mathcal{G}_{ h,k}$. Without lo

Figures (14)

Figure 1: Two topologies in federated learning
Figure 2: Privacy comparisons of centralized and decentralized logistic regression. (a) Training loss and (b) Reconstruction error of input data as a function of iteration number $(t)$ using CFL (blue color) and DFL (red color).
Figure 3: (a) Averaged SSIM of reconstructed inputs by inverting noisy gradients, gradient differences, and the gradient sum (blue lines) and test accuracy (red line) for different variances of initialized auxiliary variable $\boldsymbol{z}^{(0)}$: $\sigma^2_Z=0,10^{-8},10^{-7},2.5\times10^{-7},10^{-6},10^{-5},2.5\times10^{-5}$ and $10^{-4}$. (b) Sample examples of reconstructed inputs for each case.
Figure 4: Performance of reconstructed inputs via inverting gradients (CFL) and gradient differences (DFL) in terms of iterations $t$: (a) Averaged SSIM (solid lines) of all reconstructed inputs along with the corresponding standard deviation (shaded regions), (b) sample examples of reconstructed inputs at iteration number $t=1, 100,\ldots, 900$.
Figure 5: Performance comparisons of CFL and DFL via inverting inputs from gradient differences: (a) Sample images of ground truth and reconstructed inputs, (b) SSIM comparisons of all reconstructed inputs for different batch sizes $n_i=1,2,4,8$ using two datasets, MNIST (top) and CIFAR-10 (bottom), respectively.
...and 9 more figures

Theorems & Definitions (13)

Remark 1
Proposition 1
proof
Theorem 1: Privacy bounds of DFL
proof
Remark 2
Corollary 1
proof
Remark 3
Remark 4
...and 3 more
