GI-PIP: Do We Require Impractical Auxiliary Dataset for Gradient Inversion Attacks?

Yu Sun, Gaojian Xiong, Xianxun Yao, Kailang Ma, Jian Cui

TL;DR

The paper addresses privacy risks in federated learning posed by gradient inversion attacks under a realistic threat model where adversaries possess only gradients and a small auxiliary dataset. It introduces GI-PIP, a gradient-inversion framework that uses an anomaly-detection-derived prior (Anomaly Score loss) together with Total Variation loss and Gradient Matching loss within a Bayesian-inspired objective to recover private data. GI-PIP demonstrates strong reconstruction quality with limited auxiliary data (e.g., 3.8% of ImageNet achieving 16.12 dB PSNR) and exhibits superior distribution generalization compared with GAN-based methods, highlighting a more realistic and potent attack vector. These findings underscore the need for robust privacy-preserving mechanisms in FL and motivate further research into defenses against gradient-inversion threats under practical data assumptions.
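The three loss terms named above can be pictured as one weighted objective that the attacker minimizes over the dummy image. The sketch below is illustrative only: the weighted-sum form, the weight names `w_as` and `w_tv`, and the callables `grad_match` and `anomaly_score` are assumptions made here, not the paper's exact formulation. Only the total variation term is implemented concretely (in its common anisotropic form); the other two are passed in as placeholder functions.

```python
from typing import Callable, List

# Hypothetical representation: a single-channel image as rows of pixel values.
Image = List[List[float]]


def total_variation(img: Image) -> float:
    """Anisotropic total variation: sum of absolute differences
    between vertically and horizontally adjacent pixels."""
    height, width = len(img), len(img[0])
    tv = 0.0
    for r in range(height):
        for c in range(width):
            if r + 1 < height:
                tv += abs(img[r + 1][c] - img[r][c])
            if c + 1 < width:
                tv += abs(img[r][c + 1] - img[r][c])
    return tv


def attack_objective(
    dummy: Image,
    grad_match: Callable[[Image], float],     # placeholder for the Gradient Matching loss
    anomaly_score: Callable[[Image], float],  # placeholder for the Anomaly Score loss
    w_as: float = 1.0,  # assumed weight, not taken from the paper
    w_tv: float = 1.0,  # assumed weight, not taken from the paper
) -> float:
    """Weighted sum of the three terms, evaluated on the current dummy image."""
    return (
        grad_match(dummy)
        + w_as * anomaly_score(dummy)
        + w_tv * total_variation(dummy)
    )


checkerboard = [[0.0, 1.0], [1.0, 0.0]]
flat = [[0.5, 0.5], [0.5, 0.5]]
tv_checkerboard = total_variation(checkerboard)
tv_flat = total_variation(flat)
```

A noisy or checkerboard-like candidate receives a high total variation value while a smooth one receives a low value, which is why this term acts as a generic smoothness prior alongside the data-driven anomaly score.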

Abstract

Deep gradient inversion attacks pose a serious threat to Federated Learning (FL) by accurately recovering private data from shared gradients. However, state-of-the-art attacks rely heavily on the impractical assumption of access to excessive auxiliary data, which violates the basic data partitioning principle of FL. In this paper, a novel method, Gradient Inversion Attack using Practical Image Prior (GI-PIP), is proposed under a revised threat model. GI-PIP exploits anomaly detection models to capture the underlying distribution from less data, whereas GAN-based methods consume significantly more data to synthesize images. The extracted distribution is then leveraged to regularize the attack process through the Anomaly Score loss. Experimental results show that GI-PIP achieves a 16.12 dB PSNR recovery using only 3.8% of the ImageNet data, while GAN-based methods require over 70%. Moreover, GI-PIP exhibits superior distribution generalization compared to GAN-based methods. Our approach significantly alleviates the auxiliary data requirement, in both amount and distribution, in gradient inversion attacks, hence posing a more substantial threat to real-world FL.
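For readers unfamiliar with the PSNR figures quoted above, the following minimal sketch shows the standard definition, PSNR = 10 · log10(MAX² / MSE). The pixel range (`max_value=1.0`) and the flat-list image representation are assumptions made here for illustration; the source does not state how the paper scales pixel values.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float],
         reconstruction: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized
    flattened images. max_value is the assumed peak pixel value."""
    if len(reference) == 0 or len(reference) != len(reconstruction):
        raise ValueError("images must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


original = [0.0, 0.0, 0.0, 0.0]
recovered = [0.1, 0.1, 0.1, 0.1]
score = psnr(original, recovered)  # uniform error of 0.1 gives about 20 dB
```

Under the same [0, 1] scaling assumption, a PSNR of 16.12 dB corresponds to a mean squared error of roughly 0.024, since MSE = 10^(-PSNR/10).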

Paper Structure

This paper contains 23 sections, 5 equations, 5 figures, and 2 tables.

Figures (5)

  • Figure 1: (a) Data partitioning in FL. (b) Leakage results on ImageNet using various volumes of the auxiliary dataset.
  • Figure 2: An overview of the GI-PIP framework. On the client side, the victim performs local training normally and transmits shared gradients to the server. On the honest-but-curious server side, under the supervision of the calculated loss, leakage from gradients can be achieved through iterative optimization of the randomly initialized dummy data.
  • Figure 3: Comparison of GI-PIP and existing methods. (a) Batch recovery of ImageNet. (b) Batch recovery of CIFAR10.
  • Figure 4: Ablation studies on AS loss. For better visual comprehension, a representative sample is chosen and shown.
  • Figure 5: Attack performance (with a batch size of 4 on ImageNet) vs. volume of auxiliary data. The share of the dataset used as auxiliary data is gradually increased from $3.8\%$ to $96.2\%$.
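The iterative optimization described in the Figure 2 caption can be illustrated with a deliberately tiny stand-in. The sketch below is not the GI-PIP pipeline: it assumes a one-sample linear model with squared loss, a known label, a squared-distance gradient matching loss, finite-difference gradients instead of automatic differentiation, and a fixed starting point in place of random initialization. It uses no anomaly score or total variation prior. All function names are hypothetical.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def shared_gradient(w: Vector, b: float, x: Vector, y: float) -> Tuple[Vector, float]:
    """Gradient of 0.5 * (w.x + b - y)^2 with respect to (w, b) for one sample."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) + b - y
    return [residual * xi for xi in x], residual


def matching_loss(w: Vector, b: float, y: float, dummy: Vector,
                  target: Tuple[Vector, float]) -> float:
    """Squared distance between the dummy data's gradient and the observed one."""
    g_w, g_b = shared_gradient(w, b, dummy, y)
    t_w, t_b = target
    return sum((a - t) ** 2 for a, t in zip(g_w, t_w)) + (g_b - t_b) ** 2


def numerical_grad(f: Callable[[Vector], float], x: Vector, eps: float = 1e-6) -> Vector:
    """Central finite differences; a stand-in for automatic differentiation."""
    grad = []
    for i in range(len(x)):
        hi, lo = x[:], x[:]
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad


def invert(w: Vector, b: float, y: float, target: Tuple[Vector, float],
           dummy: Vector, steps: int = 200, lr: float = 0.1) -> Tuple[Vector, float]:
    """Gradient descent on the dummy input with a backtracking step size,
    so the matching loss never increases between iterations."""
    f = lambda v: matching_loss(w, b, y, v, target)
    loss = f(dummy)
    for _ in range(steps):
        g = numerical_grad(f, dummy)
        step = lr
        for _ in range(30):  # halve the step until the loss improves
            candidate = [d - step * gi for d, gi in zip(dummy, g)]
            candidate_loss = f(candidate)
            if candidate_loss < loss:
                dummy, loss = candidate, candidate_loss
                break
            step /= 2
    return dummy, loss


# Victim side: the private sample and the gradient it would share.
weights, bias = [0.5, -0.3, 0.8], 0.1
private_x, label = [0.2, 0.7, 0.4], 1.0
observed = shared_gradient(weights, bias, private_x, label)

# Server side: start from a dummy input and optimize it to match the gradient.
start = [0.0, 0.0, 0.0]
initial_loss = matching_loss(weights, bias, label, start, observed)
recovered_x, final_loss = invert(weights, bias, label, observed, start)
```

The backtracking step only guarantees that the matching loss does not increase; whether the dummy input ends up close to the private sample depends on the starting point and on the model. For deep networks and image batches the problem is far less well-behaved, which is where priors such as the Anomaly Score and Total Variation losses come in.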