On the Byzantine-Resilience of Distillation-Based Federated Learning
Christophe Roux, Max Zimmer, Sebastian Pokutta
TL;DR
This paper analyzes the Byzantine resilience of distillation-based federated learning (FedDistill), where clients share predictions on a public dataset instead of model parameters. It shows that prediction-space attacks are more constrained than parameter-space attacks, yielding intrinsic resilience, and provides a formal bound: if $\tilde{w}$ is a stationary point of the distorted objective, then $\|\nabla F(\tilde{w})\| \le \mathcal{O}(C^2 \alpha_{\mathrm{frac}}^2)$ and, with $L$-smooth losses, $\mathbb{E}[\|\nabla F(\bar{w}_T)\|^2] = \mathcal{O}(\varepsilon + \alpha_{\mathrm{frac}}^2)$. The authors introduce two KD-specific attacks, the Loss Maximization Attack (LMA) and the Class Prior Attack (CPA), which effectively disrupt training. They also propose ExpGuard, a defense based on robust aggregation, and HIPS, a framework that obfuscates attacks to make them harder to detect. Experiments on CIFAR-10/100, CINIC-10, and Clothing1M show that FedDistill is far more resilient than FedAvg under standard Byzantine attacks, yet remains vulnerable to the proposed KD-specific attacks unless defended by ExpGuard; obfuscating the attacks with HIPS makes them harder still to detect. Overall, the work clarifies the Byzantine risks in KD-based FL and contributes both a concrete defense mechanism and an attack obfuscation strategy to guide future robustness research.
Abstract
Federated Learning (FL) algorithms using Knowledge Distillation (KD) have received increasing attention due to their favorable properties with respect to privacy, non-i.i.d. data, and communication cost. These methods depart from transmitting model parameters and instead communicate information about a learning task by sharing predictions on a public dataset. In this work, we study the performance of such approaches in the Byzantine setting, where a subset of the clients acts adversarially with the aim of disrupting the learning process. We show that KD-based FL algorithms are remarkably resilient and analyze how Byzantine clients can influence the learning process. Based on these insights, we introduce two new Byzantine attacks and demonstrate their ability to break existing Byzantine-resilient methods. Additionally, we propose a novel defence method which enhances the Byzantine resilience of KD-based FL algorithms. Finally, we provide a general framework to obfuscate attacks, making them significantly harder to detect and thereby improving their effectiveness. Our findings serve as an important building block in the analysis of Byzantine FL, contributing new attacks and new defence mechanisms and further advancing the robustness of KD-based FL algorithms.
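
To make the prediction-sharing setting concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the server aggregates client predictions on a public sample by plain averaging and that every client, including a Byzantine one, must submit a valid probability vector. The function names and the example numbers are hypothetical. Under these assumptions, a Byzantine fraction $\alpha$ can move the averaged prediction by at most $2\alpha$ in $\ell_1$ distance, because any two probability vectors are at most $2$ apart in $\ell_1$. This elementary fact illustrates why prediction-space attacks are constrained; it is not the bound stated in the paper.

```python
def mean_prediction(preds):
    """Coordinate-wise mean of a list of probability vectors."""
    n = len(preds)
    return [sum(p[k] for p in preds) / n for k in range(len(preds[0]))]


def l1(u, v):
    """L1 distance between two vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(u, v))


# Hypothetical predictions of three honest clients on one public sample
# with three classes (illustrative numbers, not taken from the paper).
honest = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.8, 0.1, 0.1]]

# One Byzantine client puts all mass on the least likely class.
# Assumption: its submission must still be a valid probability vector.
byzantine = [[0.0, 0.0, 1.0]]

clean = mean_prediction(honest)                 # average without the attacker
attacked = mean_prediction(honest + byzantine)  # average with the attacker

alpha = len(byzantine) / (len(honest) + len(byzantine))  # Byzantine fraction
shift = l1(clean, attacked)

print("Byzantine fraction:", alpha)
print("L1 shift of the averaged prediction:", round(shift, 4))
print("Worst-case shift under plain averaging (2 * alpha):", 2 * alpha)
```

In parameter space, by contrast, a single client can submit an update of arbitrary norm, so plain averaging offers no comparable guarantee without an additional robust aggregation rule.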
