Defending Against Data Reconstruction Attacks in Federated Learning: An Information Theory Approach

Qi Tan, Qi Li, Yi Zhao, Zhuotao Liu, Xiaobing Guo, Ke Xu

TL;DR

The paper tackles privacy leakage in Federated Learning (FL) caused by Data Reconstruction Attacks (DRA). It formulates an information-theoretic channel model that ties DRA success to the mutual information (MI) $I(\mathbf{D}; \mathbf{W})$ between the local data $\mathbf{D}$ and the transmitted parameters $\mathbf{W}$. It proves a lower bound on reconstruction error in terms of MI, $\mathbb{E}[\|\mathbf{D}-\hat{\mathbf{D}}(\mathbf{W})\|^2/d] \ge \frac{e^{2h(\mathbf{D})/d}}{2\pi e} e^{-2I(\mathbf{D}; \mathbf{W})/d}$, so the less information an attacker obtains, the larger the guaranteed error. It also introduces a channel-capacity-based mechanism to bound leakage across training rounds. The authors then shift the defense from parameter-space manipulation to data-space protection: the data are mapped to a noisy version $\widetilde{\mathbf{D}} = \mathbf{D}+\boldsymbol{\xi}$, and the parameter $\sigma$ is obtained by solving $f(\sigma)=\kappa$ so that the per-round leakage $I(\mathbf{D}; \widetilde{\mathbf{W}}_o|\mathbf{W}_i)$ is bounded by the threshold $\kappa$. They present three data-space channel implementations (Natural, White, and Personalized), report improved utility-privacy tradeoffs, and show compatibility with DP, compression, and large-batch strategies, validated by extensive experiments on real datasets.
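To make the calibration step $f(\sigma)=\kappa$ concrete, here is a minimal sketch under loudly stated assumptions. The summary does not give the paper's $f$, so the code substitutes the textbook Gaussian-channel bound $\frac{d}{2}\ln(1 + P/\sigma^2)$, where $P$ bounds the per-coordinate second moment of the data and $\boldsymbol{\xi}$ is i.i.d. Gaussian with standard deviation $\sigma$. The function names, the value of $P$, and the numbers in the usage lines are hypothetical and are not taken from the paper.

```python
import math
import random


def awgn_capacity_nats(signal_var: float, sigma: float, d: int) -> float:
    """Standard Gaussian-channel bound in nats: (d/2) * ln(1 + P / sigma^2).

    ASSUMPTION: a stand-in for the paper's f(sigma), not the authors' formula.
    """
    return 0.5 * d * math.log1p(signal_var / sigma**2)


def solve_sigma(signal_var: float, kappa: float, d: int) -> float:
    """Closed-form root of awgn_capacity_nats(signal_var, sigma, d) == kappa."""
    return math.sqrt(signal_var / math.expm1(2.0 * kappa / d))


def add_noise(data: list[float], sigma: float, rng: random.Random) -> list[float]:
    """Return D~ = D + xi, with xi ~ N(0, sigma^2) drawn independently per coordinate."""
    return [x + rng.gauss(0.0, sigma) for x in data]


d = 100        # hypothetical data dimension
kappa = 10.0   # hypothetical per-round information budget, in nats
sigma = solve_sigma(signal_var=1.0, kappa=kappa, d=d)
noisy = add_noise([0.5] * d, sigma, random.Random(0))
```

Under this stand-in, a smaller budget $\kappa$ forces a larger $\sigma$. Because the parameters are then computed from $\widetilde{\mathbf{D}}$ rather than $\mathbf{D}$, the data-processing inequality suggests they cannot carry more information about $\mathbf{D}$ than $\widetilde{\mathbf{D}}$ does, provided the training randomness is independent of $\mathbf{D}$.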

Abstract

Federated Learning (FL) trains a black-box and high-dimensional model among different clients by exchanging parameters instead of direct data sharing, which mitigates the privacy leak incurred by machine learning. However, FL still suffers from membership inference attacks (MIA) or data reconstruction attacks (DRA). In particular, an attacker can extract the information from local datasets by constructing DRA, which cannot be effectively throttled by existing techniques, e.g., Differential Privacy (DP). In this paper, we aim to ensure a strong privacy guarantee for FL under DRA. We prove that reconstruction errors under DRA are constrained by the information acquired by an attacker, which means that constraining the transmitted information can effectively throttle DRA. To quantify the information leakage incurred by FL, we establish a channel model, which depends on the upper bound of joint mutual information between the local dataset and multiple transmitted parameters. Moreover, the channel model indicates that the transmitted information can be constrained through data space operation, which can improve training efficiency and the model accuracy under constrained information. According to the channel model, we propose algorithms to constrain the information transmitted in a single round of local training. With a limited number of training rounds, the algorithms ensure that the total amount of transmitted information is limited. Furthermore, our channel model can be applied to various privacy-enhancing techniques (such as DP) to enhance privacy guarantees against DRA. Extensive experiments with real-world datasets validate the effectiveness of our methods.

Paper Structure

This paper contains 33 sections, 7 theorems, 53 equations, 33 figures, 2 tables.

Key Result

Theorem 1

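For any random variables ${\bm{D}} \in \mathbb{R}^{d}$ and ${\bm{W}} \in \mathbb{R}^{m}$, we have

$$\mathbb{E}\left[\frac{\|{\bm{D}}-\hat{{\bm{D}}}({\bm{W}})\|^2}{d}\right] \ge \frac{e^{2h({\bm{D}})/d}}{2\pi e}\, e^{-2I({\bm{D}}; {\bm{W}})/d},$$

where $\hat{{\bm{D}}}({\bm{W}})$ is an estimator of ${\bm{D}}$ constructed from ${\bm{W}}$, $h({\bm{D}})$ is the differential entropy of ${\bm{D}}$, and $I({\bm{D}}; {\bm{W}})$ is the mutual information between ${\bm{D}}$ and ${\bm{W}}$.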
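As a quick numerical check of the bound, the sketch below evaluates the right-hand side $\frac{e^{2h(\mathbf{D})/d}}{2\pi e} e^{-2I(\mathbf{D};\mathbf{W})/d}$ as it is stated in the TL;DR. It assumes entropies and mutual information are measured in nats, and it uses an isotropic Gaussian $\mathbf{D} \sim \mathcal{N}(0, s^2 I_d)$ purely as an illustrative special case. The function names and the values of $d$ and $s^2$ are hypothetical.

```python
import math


def gaussian_entropy_nats(d: int, variance: float) -> float:
    """Differential entropy (nats) of N(0, variance * I_d)."""
    return 0.5 * d * math.log(2.0 * math.pi * math.e * variance)


def reconstruction_error_lower_bound(h_nats: float, mi_nats: float, d: int) -> float:
    """Per-dimension MSE lower bound: exp(2h/d) / (2*pi*e) * exp(-2I/d).

    The two exponentials are merged into exp(2(h - I)/d) to avoid overflow
    when d is large.
    """
    return math.exp(2.0 * (h_nats - mi_nats) / d) / (2.0 * math.pi * math.e)


d = 3072          # hypothetical dimension (e.g. a 32x32 RGB image)
variance = 0.25   # hypothetical per-coordinate variance s^2
h = gaussian_entropy_nats(d, variance)

# With no leaked information the bound equals the data variance s^2.
bound_no_leak = reconstruction_error_lower_bound(h, 0.0, d)

# Leaking (d/2) * ln(4) nats shrinks the bound by a factor of 4.
bound_leaky = reconstruction_error_lower_bound(h, 0.5 * d * math.log(4.0), d)
```

In the Gaussian case the bound simplifies to $s^2 e^{-2I/d}$: every additional $d/2$ nats of leaked information lowers the guaranteed per-dimension error by a factor of $e$.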

Figures (33)

  • Figure 1: To enhance the capability for defending against DRA in FL, we develop techniques to constrain the amount of transmitted information below a certain threshold $\kappa$.
  • Figure 2: The process of FL can be unfolded to a time-dependent Markov chain. Hence we can analyze the mutual information in a round from ${\bm{W}}^{(t)}_{i}$ to ${\bm{W}}^{(t+1)}_{i}$.
  • Figure 3: The channel capacity $C^{(t)}$ is the maximum MI increment at round $t$. It is an increasing function of $\lambda^{(t)}$ and a decreasing function of $\sigma^{(t)}$.
  • Figure 4: A toy example to explain the rationale for constraining in data space, which is equivalent to adding an adaptive noise to the parameter.
  • Figure 5: Visualizations for CelebA when we apply different channel implementations (Natural, White, and Personalized) and utilize different channel capacities.
  • ...and 28 more figures

Theorems & Definitions (14)

  • Theorem 1: Lower bound for reconstruction error
  • Lemma 1: Maximum entropy distribution
  • Theorem 2: Channel capacity
  • Theorem 3: Upper bound of Personalized Channel
  • Theorem 4: Channel capacity for DP
  • Lemma 2
  • proof
  • proof
  • proof
  • proof
  • ...and 4 more