On the Byzantine-Resilience of Distillation-Based Federated Learning

Christophe Roux, Max Zimmer, Sebastian Pokutta

TL;DR

This paper analyzes the Byzantine resilience of distillation-based federated learning (FedDistill), where clients share predictions on a public dataset instead of model parameters. It shows that prediction-space attacks are more constrained than parameter-space attacks, which gives FedDistill intrinsic resilience, and it provides a formal bound: if $\tilde{w}$ is a stationary point of the distorted objective, then $\|\nabla F(\tilde{w})\| \le \mathcal{O}(C^2 \alpha_{\mathrm{frac}}^2)$ and, with $L$-smooth losses, $\mathbb{E}[\|\nabla F(\bar{w}_T)\|^2] = \mathcal{O}(\varepsilon + \alpha_{\mathrm{frac}}^2)$, where $\alpha_{\mathrm{frac}}$ is the fraction of Byzantine clients and $C>0$ is a constant independent of the client predictions. The authors introduce two KD-specific attacks, the Loss Maximization Attack (LMA) and the Class Prior Attack (CPA), which disrupt training even against existing Byzantine-resilient methods. They also propose ExpGuard, a robust aggregation-based defense, and HIPS, a framework that obfuscates attacks to make them harder to detect. Empirically, they evaluate on CIFAR-10/100, CINIC-10, and Clothing1M, showing that FedDistill is far more resilient than FedAvg under standard Byzantine attacks but remains vulnerable to the proposed KD-specific attacks, especially when these are obfuscated with HIPS, unless defended by ExpGuard. Overall, the work clarifies how Byzantine clients can influence KD-based FL and contributes both new attacks and a new defense.

Abstract

Federated Learning (FL) algorithms using Knowledge Distillation (KD) have received increasing attention due to their favorable properties with respect to privacy, non-i.i.d. data and communication cost. These methods depart from transmitting model parameters and instead communicate information about a learning task by sharing predictions on a public dataset. In this work, we study the performance of such approaches in the byzantine setting, where a subset of the clients act in an adversarial manner aiming to disrupt the learning process. We show that KD-based FL algorithms are remarkably resilient and analyze how byzantine clients can influence the learning process. Based on these insights, we introduce two new byzantine attacks and demonstrate their ability to break existing byzantine-resilient methods. Additionally, we propose a novel defence method which enhances the byzantine resilience of KD-based FL algorithms. Finally, we provide a general framework to obfuscate attacks, making them significantly harder to detect, thereby improving their effectiveness. Our findings serve as an important building block in the analysis of byzantine FL, contributing through the development of new attacks and new defence mechanisms, further advancing the robustness of KD-based FL algorithms.
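
To make the communication pattern concrete, here is a minimal sketch, assuming the server aggregates client predictions on each public sample with a plain mean. The helper names `mean_prediction` and `l1_distance` and all numbers are hypothetical and not taken from the paper. The sketch also illustrates why the prediction space constrains an attacker: every submitted prediction must be a probability vector, so a Byzantine fraction `alpha` can move the mean by at most `2 * alpha` in L1 distance.

```python
def mean_prediction(predictions):
    """Average a list of probability vectors (one per client) for one public sample."""
    n = len(predictions)
    num_classes = len(predictions[0])
    return [sum(p[c] for p in predictions) / n for c in range(num_classes)]


def l1_distance(p, q):
    """L1 distance between two vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Illustrative predictions for a single public sample (made-up numbers).
honest = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
]
# Three Byzantine clients all put probability one on the last class.
byzantine = [[0.0, 0.0, 1.0] for _ in range(3)]

honest_mean = mean_prediction(honest)
overall_mean = mean_prediction(honest + byzantine)

# Fraction of Byzantine clients, here 3/7.
alpha = len(byzantine) / (len(honest) + len(byzantine))

# The overall mean is (1 - alpha) * honest_mean + alpha * byzantine_mean,
# so the shift equals alpha * ||byzantine_mean - honest_mean||_1 <= 2 * alpha.
shift = l1_distance(overall_mean, honest_mean)
print(f"alpha = {alpha:.3f}, shift = {shift:.3f}, bound = {2 * alpha:.3f}")
```

The bound follows because any two probability vectors are at most 2 apart in L1 distance. A parameter-space attack has no comparable constraint, since a Byzantine client can submit arbitrarily large parameter vectors.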

Paper Structure

This paper contains 40 sections, 6 theorems, 39 equations, 7 figures, 10 tables, 2 algorithms.

Key Result

Theorem 1

If $\tilde{w}$ is a stationary point of the distorted objective (eq:p-distill), then it is also an $\mathcal{O}(C^2 \alpha_{\mathrm{frac}}^2)$-approximate stationary point of eq:p-distill-og, where $C>0$ is a constant independent of the client predictions. Further, in expectation, running on eq:p-distill to achieve an $\varepsilon$-

Figures (7)

  • Figure 1: ResNet-18 on CINIC-10: Final test accuracy of the two compared methods, varying the fraction of byzantine clients for two naive attacks.
  • Figure 2: Attack procedures for a three-class classification problem with four honest and three byzantine clients, i.e., $\alpha_{\mathrm{frac}}=3/7$. The left part of the figure shows the computation of the honest mean. LMA (upper right) assigns probability one to the least likely class based on the honest mean $\bar{Y}_{\mathcal{H}}$, and CPA (lower right) assigns probability one to the class that is least similar to the most likely class of $\bar{Y}_{\mathcal{H}}$, according to the similarity matrix $S$. Note that all computations are done per sample $x$, which is omitted from the notation for legibility.
  • Figure 3: ResNet-18 on CIFAR-10: Test accuracy over communication rounds under attack, with 9 byzantine clients out of 20 in total.
  • Figure 4: Attack spaces in $\Delta_3$: The blue dots represent the predictions by the honest clients and $\bar{Y}_{\mathcal{H}}$ is their mean. The attack space is highlighted. $Y_{\mathcal{B}}$ is the byzantine prediction, and $\bar{Y}$ denotes the mean of all clients for $\alpha_{\mathrm{frac}}=0.5$. The red line joining them represents the mean $\bar{Y}$ corresponding to different $\alpha_{\mathrm{frac}}\in [0,0.5]$.
  • Figure 5: (Left) ResNet-18 on CINIC-10 with 10 communication rounds: Final test accuracy for different numbers of clients. (Right) ResNet-18 on CINIC-10 with 20 clients: Final test accuracy for different numbers of communication rounds.
  • ...and 2 more figures
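
The two attack rules described in the Figure 2 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the example honest mean, and the values of the similarity matrix `S` are all hypothetical, and how the paper actually constructs `S` is not stated in this summary. Ties are broken by taking the lowest class index, which is also an assumption.

```python
def argmin(values):
    """Index of the smallest entry (lowest index wins ties)."""
    return min(range(len(values)), key=lambda i: values[i])


def argmax(values):
    """Index of the largest entry (lowest index wins ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def one_hot(index, num_classes):
    """Probability vector with all mass on a single class."""
    return [1.0 if c == index else 0.0 for c in range(num_classes)]


def lma_prediction(honest_mean):
    """LMA per the caption: probability one on the class that the
    honest mean considers least likely."""
    return one_hot(argmin(honest_mean), len(honest_mean))


def cpa_prediction(honest_mean, similarity):
    """CPA per the caption: probability one on the class least similar
    to the honest mean's most likely class, according to `similarity`."""
    top_class = argmax(honest_mean)
    target = argmin(similarity[top_class])
    return one_hot(target, len(honest_mean))


# Hypothetical honest mean for one public sample (three classes).
honest_mean = [0.6, 0.3, 0.1]

# Hypothetical symmetric similarity matrix S; entry S[i][j] is the
# similarity between classes i and j (made-up values).
S = [
    [1.0, 0.2, 0.7],
    [0.2, 1.0, 0.4],
    [0.7, 0.4, 1.0],
]

print(lma_prediction(honest_mean))     # [0.0, 0.0, 1.0]
print(cpa_prediction(honest_mean, S))  # [0.0, 1.0, 0.0]
```

In this example the two attacks pick different targets: LMA follows the honest mean alone and selects class 2, while CPA consults `S` and selects class 1, the class least similar to the honest top class 0. Both rules are applied independently per public sample.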

Theorems & Definitions (12)

  • Theorem 1: Informal
  • Lemma 2
  • Proof of lem:grad-y
  • Theorem 3
  • Proof of thm:approx-stat
  • Lemma 4
  • Proof of lem:max-sol
  • Lemma 5
  • Remark 6
  • Proof of lem:lma
  • ...and 2 more