HYDRA-FL: Hybrid Knowledge Distillation for Robust and Accurate Federated Learning

Momin Ahmad Khan, Yasra Chandio, Fatima Muhammad Anwar

TL;DR

HYDRA-FL (Hybrid Knowledge Distillation for Robust and Accurate FL) is a new algorithm that reduces the impact of model poisoning attacks by offloading some of the KD loss to a shallow layer via an auxiliary classifier.

Abstract

Data heterogeneity among Federated Learning (FL) users poses a significant challenge, resulting in reduced global model performance. The community has designed various techniques to tackle this issue, among which Knowledge Distillation (KD)-based techniques are common. While these techniques effectively improve performance under high heterogeneity, they inadvertently cause higher accuracy degradation under model poisoning attacks (known as attack amplification). This paper presents a case study to reveal this critical vulnerability in KD-based FL systems. We show why KD causes this issue through empirical evidence and use it as motivation to design a hybrid distillation technique. We introduce a novel algorithm, Hybrid Knowledge Distillation for Robust and Accurate FL (HYDRA-FL), which reduces the impact of model poisoning attacks by offloading some of the KD loss to a shallow layer via an auxiliary classifier. We model HYDRA-FL as a generic framework and adapt it to two KD-based FL algorithms, FedNTD and MOON. Using these two as case studies, we demonstrate that our technique outperforms baselines in attack settings while maintaining comparable performance in benign settings.
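
The idea of splitting the KD loss between the final layer and a shallow auxiliary classifier can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the weights `beta_final` and `beta_shallow`, and the use of a KL-divergence term at both heads are assumptions made for illustration only (the paper's exact loss terms for FedNTD and MOON are not given in this summary).

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over the last axis.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean cross-entropy between predicted logits and integer labels.
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def kl_divergence(teacher_logits, student_logits, temperature=1.0):
    # Mean KL(teacher || student) over the batch.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    eps = 1e-12
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

def hybrid_kd_loss(final_logits, aux_logits, global_logits, labels,
                   beta_final=0.3, beta_shallow=0.7, temperature=1.0):
    # Hypothetical hybrid objective: the final-layer KD term is down-weighted
    # and part of the distillation pressure is moved to the auxiliary
    # classifier attached to a shallow layer.
    ce = cross_entropy(final_logits, labels)
    kd_final = kl_divergence(global_logits, final_logits, temperature)
    kd_shallow = kl_divergence(global_logits, aux_logits, temperature)
    return ce + beta_final * kd_final + beta_shallow * kd_shallow
```

Here `global_logits` stands for the global model's outputs on the same batch, `final_logits` for the client model's output layer, and `aux_logits` for the auxiliary classifier's output. Setting `beta_shallow=0` in this sketch leaves a plain final-layer KD objective, which is the configuration the abstract describes as prone to attack amplification.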

Paper Structure

This paper contains 31 sections, 12 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Overview of attack amplification through knowledge distillation. a) In the benign setting, KD reduces drift and brings benign local models closer to the benign global model. b) In the malicious setting, KD unknowingly reduces drift between benign local models and the poisoned global model.
  • Figure 2: Impact of increasing KL-divergence loss for FedNTD and contrastive loss for MOON on accuracy.
  • Figure 3: Impact of the heterogeneity parameter $\alpha$ in benign and adversarial settings. We use the Dirichlet distribution, where a higher $\alpha$ means lower heterogeneity.
  • Figure 4: HYDRA-FL framework: we refine client model training by reducing the final layer's KD-loss and incorporating shallow KD-loss at an earlier shallow layer via an auxiliary classifier.
  • Figure 5: HYDRA-FL vs. MOON and FedAvg when auxiliary classifiers are placed at different shallow layers.
  • ...and 2 more figures
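
The Figure 3 caption refers to a Dirichlet distribution controlled by $\alpha$. A common way to simulate such label heterogeneity in FL experiments is to split each class across clients using Dirichlet-sampled proportions. The sketch below shows that general recipe; it is an assumption that the paper partitions data this way, and `dirichlet_partition` and its parameters are hypothetical names.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    # Assign sample indices to clients, class by class, with per-class
    # client shares drawn from Dirichlet(alpha). Smaller alpha gives more
    # skewed (more heterogeneous) client label distributions.
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Convert cumulative shares into split points within this class.
        cut_points = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, shard in enumerate(np.split(idx, cut_points)):
            client_indices[client_id].extend(shard.tolist())
    return client_indices

# Example: 1000 samples over 10 classes, split across 5 clients.
example_labels = np.repeat(np.arange(10), 100)
example_parts = dirichlet_partition(example_labels, num_clients=5, alpha=0.5)
```

Every sample index ends up with exactly one client. With a large `alpha` the per-class shares approach uniform, which matches the caption's statement that a higher $\alpha$ means lower heterogeneity.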