Mjolnir: Breaking the Shield of Perturbation-Protected Gradients via Adaptive Diffusion

Xuan Liu, Siqi Cai, Qihua Zhou, Song Guo, Ruibin Li, Kaiwei Lin

TL;DR

This paper addresses the vulnerability of perturbation-based gradient protections in Federated Learning by introducing Mjölnir, a diffusion-based gradient leakage attack. It leverages a surrogate gradient data supply model and a Gradient Diffusion Model, with an adaptive diffusion parameter $M$, to denoise perturbed gradients and recover original gradients without access to the original model or external data. The key contributions are: (1) revealing the diffusion properties of gradient perturbations, (2) proposing Mjölnir as the first general gradient diffusion attack, and (3) empirically demonstrating strong gradient denoising and private data recovery across DP and non-DP perturbations for DNN/CNN models, along with ablation studies on variant configurations. The findings highlight a substantive privacy risk in gradient perturbation protections and motivate the development of defense strategies beyond perturbation-based approaches for FL privacy.

Abstract

Perturbation-based mechanisms, such as differential privacy, mitigate gradient leakage attacks by introducing noise into the gradients, thereby preventing attackers from reconstructing clients' private data from the leaked gradients. However, can gradient perturbation protection mechanisms truly defend against all gradient leakage attacks? In this paper, we present the first attempt to break the shield of gradient perturbation protection in Federated Learning for the extraction of private information. We focus on common noise distributions, specifically Gaussian and Laplace, and apply our approach to DNN and CNN models. We introduce Mjolnir, a perturbation-resilient gradient leakage attack that is capable of removing perturbations from gradients without requiring additional access to the original model structure or external data. Specifically, we leverage the inherent diffusion properties of gradient perturbation protection to develop a novel diffusion-based gradient denoising model for Mjolnir. By constructing a surrogate client model that captures the structure of perturbed gradients, we obtain crucial gradient data for training the diffusion model. We further utilize the insight that monitoring disturbance levels during the reverse diffusion process can enhance gradient denoising capabilities, allowing Mjolnir to generate gradients that closely approximate the original, unperturbed versions through adaptive sampling steps. Extensive experiments demonstrate that Mjolnir effectively recovers the protected gradients and exposes the Federated Learning process to the threat of gradient leakage, achieving superior performance in gradient denoising and private data recovery.
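To make the setting concrete, here is a minimal sketch of the kind of perturbation-based protection the abstract describes: noise drawn from a Gaussian or Laplace distribution is added to each gradient entry before the gradient is shared. The helper name `perturb_gradient`, the `scale` parameter, and the representation of a gradient as a flat list of floats are assumptions made for illustration. Gradient clipping and privacy-budget calibration, which a real differential privacy mechanism would need, are deliberately left out. This is not the paper's implementation.

```python
import random


def perturb_gradient(grad, scale, kind="gaussian", rng=None):
    """Add i.i.d. zero-mean noise to a flattened gradient.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    rng = rng or random.Random()
    if kind == "gaussian":
        # Zero-mean Gaussian noise with standard deviation `scale`.
        def draw():
            return rng.gauss(0.0, scale)
    elif kind == "laplace":
        # The difference of two i.i.d. exponentials with mean `scale`
        # follows a Laplace(0, scale) distribution.
        def draw():
            return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    else:
        raise ValueError(f"unsupported noise kind: {kind!r}")
    return [g + draw() for g in grad]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
clean = [0.5, -1.2, 0.03, 2.0]
protected_gauss = perturb_gradient(clean, 0.1, "gaussian", rng)
protected_laplace = perturb_gradient(clean, 0.1, "laplace", rng)
```

An attacker in this setting observes only vectors like `protected_gauss` or `protected_laplace`; the denoising attack described above aims to recover something close to `clean` from them.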

Paper Structure

This paper contains 11 sections, 12 equations, 6 figures, 3 tables, and 3 algorithms.

Figures (6)

  • Figure 1: Threat model. The FL training process is threatened by gradient leakage attacks, in which the attacker intercepts the exchanged gradients $\nabla W$ to recover the private training data. Previous work often protects the gradients by injecting perturbations, forming $\nabla W'_N$ and $\nabla W^{Protect}$. Mjölnir removes the perturbation injected into the protected gradients via the adaptive diffusion process.
  • Figure 2: Mjölnir overview. After intercepting the exchanged gradients protected by an unknown perturbation ($\nabla W'$) from clients, the attacker (1) leverages the invariance of the gradient data structure before and after perturbation to construct the surrogate model; (2) feeds a random image dataset into the surrogate model to extract clean surrogate gradients; (3) flattens the surrogate gradients and pads them to size $g^2 = P + L$, where $L$ is the flattened length, $P$ is the padding length, and $g$ is the smallest integer satisfying $g^2 > L$, to create a surrogate gradient ($\nabla W^s$) dataset for training the Gradient Diffusion Model (if additional conditions are chosen to guide training, the pairs ($\nabla W^s$, $\nabla W^s_{perturbed}$) or ($\nabla W^s$, $\nabla W'$) form the training dataset, where $\nabla W^s_{perturbed}$ denotes surrogate gradients with a known perturbation applied; if no conditions are needed, $\nabla W^s$ is used directly); (4) uses the trained Gradient Diffusion Model to denoise $\nabla W'$ and generate the recovered gradient $\nabla W^R$. From $\nabla W^R$, the attacker can recover clients' private information.
  • Figure 3: Visualization of Markovian gradient diffusion process. $M$ is the noise scale of perturbation, which is set as the adaptive parameter in [M-Adaptive Process]. Mjölnir Train and Mjölnir Inference correspond to Algorithm 2 and Algorithm 3 respectively.
  • Figure 4: Comparison of private image recovery over iterations between Mjölnir variant models and a traditional gradient leakage attack (DLG, zhu2019deep).
  • Figure 5: Comparison of clients' ground-truth private images with the corresponding images recovered by Mjölnir variant models and by commonly used traditional gradient leakage attacks. ($\delta = 10^{-5}$; $\varepsilon = 10$; Success Rate: overall attack success rate.)
  • ...and 1 more figure
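
The flatten-and-pad step in the Figure 2 caption (step 3) can be sketched as follows. The sketch takes the caption's condition literally: $g$ is the smallest integer with $g^2 > L$, and $P = g^2 - L$ padding entries are appended. The function name `flatten_and_pad`, the use of zeros as the padding value, the row-major reshape into a $g \times g$ grid, and the representation of each layer's gradient as a flat list are all assumptions for illustration; the caption does not specify them.

```python
import math


def flatten_and_pad(layer_grads, pad_value=0.0):
    """Flatten per-layer gradients and pad them to a g x g grid.

    Hypothetical helper for illustration only; not taken from the paper.
    Returns (grid, L, P) with g*g == P + L.
    """
    # Concatenate all layers into one flat vector of length L.
    flat = [float(v) for layer in layer_grads for v in layer]
    L = len(flat)
    # isqrt(L) is floor(sqrt(L)), so isqrt(L) + 1 is the smallest
    # integer g with g*g strictly greater than L.
    g = math.isqrt(L) + 1
    P = g * g - L
    padded = flat + [pad_value] * P
    # Row-major reshape into g rows of g entries (assumed layout).
    grid = [padded[r * g:(r + 1) * g] for r in range(g)]
    return grid, L, P


# Three toy "layers" with 3 + 4 + 3 = 10 gradient entries in total.
layers = [[1, 2, 3], [4, 5, 6, 7], [8, 9, 10]]
grid, L, P = flatten_and_pad(layers)
print(L, P, len(grid))  # prints: 10 6 4
```

Because the inequality is strict, a gradient whose length is already a perfect square (for example $L = 9$) is still padded up to the next square ($g = 4$, $P = 7$) under this reading of the caption.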