Logits Poisoning Attack in Federated Distillation
Yuhan Tang, Zhiyuan Wu, Bo Gao, Tian Wen, Yuwei Wang, Sheng Sun
TL;DR
This paper exposes the vulnerability of Federated Distillation (FD) to logits poisoning attacks by introducing FDLA, an attack that subtly manipulates the confidence distribution of the logits exchanged during FD. FDLA sorts and reorders class confidences via a ranking function $r(\cdot)$ and a transformation $t(\cdot)$ to produce a contaminated logit vector $c'$, guiding local models toward incorrect yet plausible predictions without touching private data. Empirical evaluation on CIFAR-10 and SVHN under FD and FedCache shows that FDLA degrades accuracy more than baseline poisoning methods, and that the attack remains effective across levels of data heterogeneity, client counts, and model architectures. The results highlight a critical need for defenses against FD-specific threats in cross-device knowledge transfer and establish a baseline for evaluating FD defenses.
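To make the sort-and-reorder idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes one particular reordering rule that the summary above does not specify: the top-ranked class receives the lowest confidence value, and every other class moves up one rank. The function names `rank_classes` and `poison_logits` are hypothetical stand-ins for $r(\cdot)$ and $t(\cdot)$.

```python
from typing import List, Sequence


def rank_classes(c: Sequence[float]) -> List[int]:
    """Stand-in for r(.): class indices ordered from highest to lowest confidence."""
    return sorted(range(len(c)), key=lambda k: c[k], reverse=True)


def poison_logits(c: Sequence[float]) -> List[float]:
    """Stand-in for t(.), under an ASSUMED rule: rotate confidence values by one rank.

    The originally top-ranked class receives the lowest confidence value, and
    every other class receives the value of the class ranked just above it.
    The set of confidence values is unchanged, only their assignment to classes.
    """
    order = rank_classes(c)             # order[0] is the original top-1 class
    values = [c[k] for k in order]      # confidence values in descending order
    c_poisoned = list(c)
    # The former top-1 class drops to the smallest value.
    c_poisoned[order[0]] = values[-1]
    # Each remaining class moves up one rank.
    for rank in range(1, len(order)):
        c_poisoned[order[rank]] = values[rank - 1]
    return c_poisoned


# Illustrative confidence vector over four classes (made-up numbers).
clean = [0.60, 0.10, 0.25, 0.05]
poisoned = poison_logits(clean)
print(poisoned)
```

Under this assumed rule, the former runner-up class becomes the top prediction, which is one way to obtain a vector that is wrong yet still looks like a plausible confidence distribution.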
Abstract
Federated Distillation (FD) is a novel and promising distributed machine learning paradigm in which knowledge distillation is leveraged to enable more efficient and flexible cross-device knowledge transfer in federated learning. By optimizing local models with knowledge distillation, FD avoids uploading large-scale model parameters to the central server while keeping raw data on local clients. Despite the growing popularity of FD, prior work has largely left poisoning attacks within this framework unexplored, so its vulnerabilities to adversarial actions remain poorly understood. To this end, we introduce FDLA, a poisoning attack method tailored for FD. FDLA manipulates the logits communicated in FD, aiming to significantly degrade client model performance by misleading the discrimination of private samples. Through extensive simulation experiments across a variety of datasets, attack scenarios, and FD configurations, we demonstrate that FDLA effectively compromises client model accuracy, outperforming established baseline algorithms in this regard. Our findings underscore the critical need for robust defense mechanisms in FD settings to mitigate such adversarial threats.
