GI-PIP: Do We Require Impractical Auxiliary Dataset for Gradient Inversion Attacks?

Yu Sun; Gaojian Xiong; Xianxun Yao; Kailang Ma; Jian Cui

TL;DR

The paper addresses privacy risks in federated learning (FL) posed by gradient inversion attacks under a realistic threat model in which the adversary possesses only the shared gradients and a small auxiliary dataset. It introduces GI-PIP, a gradient-inversion framework that combines an anomaly-detection-derived prior (Anomaly Score loss) with Total Variation loss and Gradient Matching loss in a Bayesian-inspired objective to recover private data. GI-PIP achieves strong reconstruction quality with limited auxiliary data (e.g., 16.12 dB PSNR using only 3.8% of ImageNet) and generalizes across distributions better than GAN-based methods, making it a more realistic and potent attack. These findings underscore the need for robust privacy-preserving mechanisms in FL and motivate further research into defenses against gradient inversion under practical data assumptions.
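
To make the structure of such a three-term objective concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plain weighted sum of the three losses, a squared L2 distance for gradient matching, and an anisotropic total variation; the weights `w_as` and `w_tv` and all function names are hypothetical, and the anomaly score is passed in as a number because the anomaly detection model is not specified here.

```python
from typing import List, Sequence

Image = List[List[float]]  # single-channel image stored as rows of pixel values


def total_variation(img: Image) -> float:
    """Anisotropic total variation: sum of absolute differences
    between vertically and horizontally adjacent pixels."""
    h, w = len(img), len(img[0])
    tv = 0.0
    for i in range(h):
        for j in range(w):
            if i + 1 < h:
                tv += abs(img[i + 1][j] - img[i][j])
            if j + 1 < w:
                tv += abs(img[i][j + 1] - img[i][j])
    return tv


def gradient_matching(dummy_grad: Sequence[float],
                      shared_grad: Sequence[float]) -> float:
    """Squared L2 distance between flattened gradients.
    Assumption: the source does not state which distance is used."""
    return sum((a - b) ** 2 for a, b in zip(dummy_grad, shared_grad))


def combined_objective(gm: float, anomaly_score: float, tv: float,
                       w_as: float = 1.0, w_tv: float = 1.0) -> float:
    """Weighted sum of the three losses (hypothetical weights w_as, w_tv)."""
    return gm + w_as * anomaly_score + w_tv * tv
```

In an attack loop, the dummy image would be updated iteratively to reduce `combined_objective`; a smooth image scores zero on the total variation term, while a checkerboard-like image is penalized.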

Abstract

Deep gradient inversion attacks pose a serious threat to Federated Learning (FL) by accurately recovering private data from shared gradients. However, state-of-the-art methods rely heavily on the impractical assumption of access to excessive auxiliary data, which violates the basic data partitioning principle of FL. In this paper, a novel method, Gradient Inversion Attack using Practical Image Prior (GI-PIP), is proposed under a revised threat model. GI-PIP exploits anomaly detection models to capture the underlying distribution from less data, whereas GAN-based methods consume significantly more data to synthesize images. The extracted distribution is then leveraged to regularize the attack process in the form of an Anomaly Score loss. Experimental results show that GI-PIP achieves a 16.12 dB PSNR recovery using only 3.8% of the ImageNet data, while GAN-based methods require over 70%. Moreover, GI-PIP exhibits superior distribution generalization compared to GAN-based methods. Our approach significantly alleviates the auxiliary data requirement in both amount and distribution for gradient inversion attacks, hence posing a more substantial threat to real-world FL.
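
For readers unfamiliar with the metric, PSNR is computed from the mean squared error (MSE) between the original and recovered image as 10 log10(MAX^2 / MSE). The sketch below uses the standard definition and assumes pixel values scaled to [0, 1] (`max_value=1.0`); the function names are illustrative, and the pixel range used in the paper's evaluation is not stated here.

```python
import math
from typing import Sequence


def psnr(original: Sequence[float], reconstructed: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two flattened images."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def mse_from_psnr(psnr_db: float, max_value: float = 1.0) -> float:
    """Invert the PSNR formula to recover the mean squared error."""
    return max_value ** 2 / (10 ** (psnr_db / 10))
```

Under the [0, 1] assumption, a uniform per-pixel error of 0.1 gives 20 dB, and a PSNR of 16.12 dB corresponds to an MSE of roughly 0.024.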

Paper Structure (23 sections, 5 equations, 5 figures, 2 tables)

Introduction
Methodology
  Threat model under basic assumptions
    Threat model.
    Auxiliary dataset.
  Gradient Inversion using Practical Image Prior
    Objective function.
    Anomaly Score loss.
    Total Variation loss.
    Gradient Matching loss.
Experiments
  Experimental setup
    FL task.
    Data partitioning settings.
    Evaluation metrics.
...and 8 more sections

Figures (5)

Figure 1: (a) Data partitioning in FL. (b) Leakage results on ImageNet using various volumes of the auxiliary dataset.
Figure 2: An overview of the GI-PIP framework. On the client side, the victim performs local training normally and transmits shared gradients to the server. On the honest-but-curious server side, under the supervision of the calculated loss, leakage from gradients can be achieved through iterative optimization of the randomly initialized dummy data.
Figure 3: Comparison of GI-PIP and existing methods. (a) Batch recovery of ImageNet. (b) Batch recovery of CIFAR10.
Figure 4: Ablation studies on AS loss. For better visual comprehension, a representative sample is chosen and shown.
Figure 5: Attack performance (with a batch size of 4 on ImageNet) vs. volume of auxiliary data. The ratio of the auxiliary dataset is gradually increased from $3.8\%$ to $96.2\%$.