Re-Evaluating Privacy in Centralized and Decentralized Learning: An Information-Theoretical and Empirical Study

Changlong Ji, Stephane Maag, Richard Heusdens, Qiongxiu Li

TL;DR

This study offers a novel perspective by conducting a rigorous information-theoretical analysis of privacy leakage in FL using mutual information, and shows that DFL generally offers stronger privacy preservation than CFL in practical scenarios where a fully trusted server is not available.

Abstract

Decentralized Federated Learning (DFL) has garnered attention for its robustness and scalability compared to Centralized Federated Learning (CFL). While DFL is commonly believed to offer privacy advantages due to the decentralized control of sensitive data, recent work by Pasquini et al. challenges this view, demonstrating that DFL does not inherently improve privacy against empirical attacks under certain assumptions. Investigating this issue fully requires a formal theoretical framework. Our study offers a novel perspective by conducting a rigorous information-theoretical analysis of privacy leakage in FL using mutual information. We further investigate the effectiveness of privacy-enhancing techniques like Secure Aggregation (SA) in both CFL and DFL. Our simulations and real-world experiments show that DFL generally offers stronger privacy preservation than CFL in practical scenarios where a fully trusted server is not available. We address discrepancies in previous research by highlighting limitations in their assumptions about graph topology and privacy attacks, which inadequately capture information leakage in FL.

Paper Structure (14 sections, 1 theorem, 10 equations, 4 figures)

Key Result

Proposition 1

where the first and third equality hold if and only if the underlying graph is fully connected; the second inequality holds for all connected graphs with more than two nodes.

Figures (4)

  • Figure 1: Relative mutual information ${I_{\text{FL}}^{\text{mode}}}/{I_{\text{CFL}}}$ as a function of number of nodes $n$ in the network for CFL w/o SA, CFL w/ SA, DFL w/o SA and DFL w/ SA, wherein three different network densities for DFL are considered.
  • Figure 2: Sample images of ground truth and of inputs reconstructed by inverting gradients under four different protocols on three datasets: (a) CIFAR-10, (b) CIFAR-100 and (c) MNIST. (d) Averaged SSIM values of all reconstructed inputs in each dataset for the four modes.
  • Figure 3: SSIM values of all reconstructed inputs across three network densities $0.4, 0.8$ and $1$ for DFL cases, alongside both CFL w/ and w/o SA cases.
  • Figure 4: ROCs and attack success rates of two MIAs for CFL and DFL w/o SA cases. The AUC values represent the attack success rates.

Theorems & Definitions (2)

  • Proposition 1
  • proof