Recovering Labels from Local Updates in Federated Learning
Huancheng Chen, Haris Vikalo
TL;DR
This paper examines the vulnerability of federated learning to gradient-inversion attacks by introducing RLU, a label-recovery method that exploits correlations between local output-layer updates and data labels. By coupling an auxiliary-dataset-based estimate of erroneous-confidence moments with Monte Carlo modeling of epoch dynamics, RLU can recover labels from both untrained and well-trained models and under various FL schemes with multiple local epochs. The approach yields near-perfect single-epoch performance and strong robustness to data heterogeneity and different optimizers, and it also strengthens gradient-inversion attacks by providing reliable labels for image reconstruction. The findings underscore significant privacy risks in FL and motivate the development of defenses against label recovery from local updates.
Abstract
Gradient inversion (GI) attacks threaten the privacy of clients in federated learning (FL) by aiming to reconstruct the clients' data from communicated model updates. A number of such techniques attempt to accelerate data recovery by first reconstructing the labels of the samples used in local training. However, existing label extraction methods make strong assumptions that typically do not hold in realistic FL settings. In this paper we present a novel label recovery scheme, Recovering Labels from Local Updates (RLU), which provides near-perfect accuracy when attacking untrained (most vulnerable) models. More significantly, RLU achieves high performance even in realistic settings where the clients in an FL system run multiple local epochs, train on heterogeneous data, and deploy various optimizers to minimize different objective functions. Specifically, RLU estimates labels by solving a least-squares problem that emerges from the analysis of the correlation between the labels of the data points used in a training round and the resulting update of the output layer. The experimental results on several datasets, architectures, and data heterogeneity scenarios demonstrate that the proposed method consistently outperforms existing baselines and helps improve the quality of the reconstructed images in GI attacks in terms of both PSNR and LPIPS.
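To make the link between output-layer updates and labels concrete, the following is a minimal sketch of a simplified special case, not the RLU algorithm itself. It assumes a single plain-SGD step with cross-entropy loss, a known learning rate and batch size, and an untrained model whose softmax outputs are close to uniform. Under those assumptions the output-layer bias update equals -lr times the batch mean of (softmax output minus one-hot label), so per-class label counts follow from a small least-squares system. The names `simulate_bias_update`, `recover_label_counts`, and `mean_probs_by_class` are hypothetical and chosen for illustration; in a realistic attack the class-conditional mean softmax matrix would have to be estimated, for instance from auxiliary data.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def simulate_bias_update(logits, labels, num_classes, lr):
    """Client side (assumed): one SGD step on the output-layer bias.

    For cross-entropy loss the bias gradient is mean_i(p_i - y_i).
    """
    probs = softmax(logits)
    onehot = np.eye(num_classes)[labels]
    grad_b = (probs - onehot).mean(axis=0)
    return -lr * grad_b


def recover_label_counts(delta_b, mean_probs_by_class, batch_size, lr):
    """Attacker side (assumed): estimate how many samples each class has.

    mean_probs_by_class[k, j] is the assumed average softmax output for
    class j over samples whose true label is k. With counts n, the update
    satisfies  (P^T - I) n = -B * delta_b / lr,  plus  sum(n) = B.
    """
    num_classes = delta_b.shape[0]
    target = -batch_size * delta_b / lr
    system = mean_probs_by_class.T - np.eye(num_classes)
    # (P^T - I) is rank deficient, so append the constraint sum(n) = B.
    a = np.vstack([system, np.ones((1, num_classes))])
    t = np.concatenate([target, [batch_size]])
    counts, *_ = np.linalg.lstsq(a, t, rcond=None)
    return np.clip(np.rint(counts), 0, None).astype(int)


# Toy demonstration on synthetic data (all values are illustrative).
rng = np.random.default_rng(0)
num_classes, batch_size, lr = 10, 32, 0.1
labels = rng.integers(0, num_classes, size=batch_size)
# Near-zero logits stand in for an untrained model.
logits = 0.01 * rng.standard_normal((batch_size, num_classes))

delta_b = simulate_bias_update(logits, labels, num_classes, lr)
uniform_p = np.full((num_classes, num_classes), 1.0 / num_classes)
recovered = recover_label_counts(delta_b, uniform_p, batch_size, lr)
true_counts = np.bincount(labels, minlength=num_classes)
print("true counts:     ", true_counts)
print("recovered counts:", recovered)
```

The sketch recovers only the number of samples per class in the batch, and it relies on the uniform-output approximation. Multiple local epochs, trained models, and other optimizers break the simple one-step identity used here, which is the regime the abstract says RLU is designed to handle.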
