GI-PIP: Do We Require Impractical Auxiliary Dataset for Gradient Inversion Attacks?
Yu Sun, Gaojian Xiong, Xianxun Yao, Kailang Ma, Jian Cui
TL;DR
The paper addresses the privacy risks that gradient inversion attacks pose to federated learning (FL) under a realistic threat model in which the adversary holds only the shared gradients and a small auxiliary dataset. It introduces GI-PIP, a gradient inversion framework that combines a prior derived from an anomaly detection model (the Anomaly Score loss) with a Total Variation loss and a Gradient Matching loss in a Bayesian-inspired objective to recover private data. GI-PIP reconstructs images well from limited auxiliary data (e.g., 16.12 dB PSNR using only 3.8% of ImageNet) and generalizes across distributions better than GAN-based methods, which makes it a more realistic and more potent attack. These findings underscore the need for robust privacy-preserving mechanisms in FL and motivate further research into defenses against gradient inversion under practical data assumptions.
Abstract
Deep gradient inversion attacks pose a serious threat to Federated Learning (FL) by accurately recovering private data from shared gradients. However, state-of-the-art attacks rely heavily on the impractical assumption of access to excessive auxiliary data, which violates the basic data partitioning principle of FL. In this paper, a novel method, Gradient Inversion Attack using Practical Image Prior (GI-PIP), is proposed under a revised threat model. GI-PIP exploits anomaly detection models to capture the underlying distribution from less data, whereas GAN-based methods consume significantly more data to synthesize images. The extracted distribution is then used to regularize the attack process in the form of an Anomaly Score loss. Experimental results show that GI-PIP achieves a recovery PSNR of 16.12 dB using only 3.8% of the ImageNet data, whereas GAN-based methods require over 70%. Moreover, GI-PIP generalizes across distributions better than GAN-based methods. Our approach significantly alleviates the auxiliary data requirement in gradient inversion attacks, in both amount and distribution, and hence poses a more substantial threat to real-world FL.
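
To make the structure of such an attack objective concrete, the following is a minimal sketch of how a Gradient Matching term, a Total Variation term, and an Anomaly Score term could be combined into a single scalar loss. It is an illustration under stated assumptions, not the authors' implementation: the squared L2 distance for gradient matching, the anisotropic form of total variation, the weights `lam_tv` and `lam_as`, and the `anomaly_score` callable (standing in for a trained anomaly detection model) are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def total_variation(x):
    """Anisotropic total variation of an image array shaped (C, H, W)."""
    dh = np.abs(x[:, 1:, :] - x[:, :-1, :]).sum()  # vertical neighbor differences
    dw = np.abs(x[:, :, 1:] - x[:, :, :-1]).sum()  # horizontal neighbor differences
    return float(dh + dw)


def gradient_matching(dummy_grads, shared_grads):
    """Squared L2 distance between two lists of per-layer gradients.

    Assumption: the distance measure used in the paper may differ
    (cosine distance is another common choice in this line of work).
    """
    return float(sum(np.sum((g - s) ** 2) for g, s in zip(dummy_grads, shared_grads)))


def attack_objective(x, dummy_grads, shared_grads, anomaly_score,
                     lam_tv=1e-4, lam_as=1e-2):
    """Weighted sum of the three loss terms for a candidate image x.

    anomaly_score is a hypothetical callable mapping an image to a scalar
    (higher = less like the auxiliary distribution). The weights lam_tv
    and lam_as are placeholder values, not taken from the paper.
    """
    return (gradient_matching(dummy_grads, shared_grads)
            + lam_tv * total_variation(x)
            + lam_as * float(anomaly_score(x)))
```

In an actual attack, `x` would be optimized by gradient descent on this objective, with `dummy_grads` recomputed from the model at each step; that requires an automatic differentiation framework and is omitted here.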
