DROP: Poison Dilution via Knowledge Distillation for Federated Learning

Georgios Syros, Anshuman Suri, Farinaz Koushanfar, Cristina Nita-Rotaru, Alina Oprea

TL;DR

DROP tackles stealthy backdoor attacks in federated learning (FL) by combining agglomerative clustering, activity monitoring, and logit-driven knowledge distillation to cleanse the global model. The authors also propose DROPlet, a lightweight variant. DROP defends robustly across diverse learning configurations and data distributions, with near-zero attack success rate (ASR) in many setups. This security costs some main task accuracy (MTA), particularly in heavily non-IID settings. The authors release their code to enable broader evaluation and benchmarking. Overall, DROP offers a practical, configuration-robust defense against backdoor attacks in realistic FL deployments, addressing both aggressive and stealthy adversaries.

Abstract

Federated Learning is vulnerable to adversarial manipulation, where malicious clients can inject poisoned updates to influence the global model's behavior. While existing defense mechanisms have made notable progress, they fail to protect against adversaries that aim to induce targeted backdoors under different learning and attack configurations. To address this limitation, we introduce DROP (Distillation-based Reduction Of Poisoning), a novel defense mechanism that combines clustering and activity-tracking techniques with extraction of benign behavior from clients via knowledge distillation to tackle stealthy adversaries that manipulate low data poisoning rates and diverse malicious client ratios within the federation. Through extensive experimentation, our approach demonstrates superior robustness compared to existing defenses across a wide range of learning configurations. Finally, we evaluate existing defenses and our method under the challenging setting of non-IID client data distribution and highlight the challenges of designing a resilient FL defense in this setting.

Paper Structure

This paper contains 29 sections, 12 equations, 6 figures, 8 tables, 1 algorithm.

Figures (6)

  • Figure 1: Examples of CIFAR-10 images from the plane class with an added backdoor trigger (top-left corner). The presence of the trigger causes the model to misclassify these inputs as the horse class, illustrating the effect of a targeted backdoor attack.
  • Figure 2: Visualizing the impact of the learning configuration (learning rate, batch size, and number of local epochs) on ASR, for a 1.25% data poisoning rate (DPR) and a 20% malicious client ratio (MCR), for CIFAR-10 with IID data. We only visualize configurations with MTA $\geq 80\%$. The attack is successful on multiple configurations (in yellow).
  • Figure 3: ASR (%) for our defense (DROP) and various existing defenses for 10 FL configurations (1.25% DPR, 20% MCR) where stealthy attacks are possible. No existing defense provides consistent protection across all configurations.
  • Figure 4: Overview of the proposed DROP defense. Each round $t$ begins with the server broadcasting the global model to all clients and selecting a subset for local training, which may include both benign (green) and malicious (red) clients. After updates are submitted, DROP employs: (1) Agglomerative Clustering to detect anomalous updates, (2) Activity Monitoring & Penalization to track and penalize suspicious clients, and (3) Knowledge Distillation, where a GAN-generated synthetic dataset and client logits guide the distillation of the global model. The final model $\mathbf{w}_{t+1}$ serves as the global model for round $t+1$.
  • Figure 5: MTA (a) and ASR (b) across rounds for various defenses, for CIFAR-10 with 1.25% DPR and 20% MCR for configuration C4. Certain defenses like FLIP and FLAME have wildly fluctuating ASR across rounds, making them unreliable. DROP instead achieves consistently low ASR in all rounds.
  • ...and 1 more figure
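
The three stages named in the Figure 4 caption can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The function names (`flag_minority`, `update_penalties`, `distillation_targets`), the average-linkage criterion, the "smaller cluster is suspicious" rule, the penalty threshold, and the temperature are all assumptions made for illustration. The GAN-generated synthetic dataset is replaced by hand-written logits, and the student-training step is omitted.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerative_two_clusters(points):
    """Average-linkage agglomerative clustering, merged down to two clusters."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 2:
        # Find the closest pair of clusters and merge them.
        _, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters

def flag_minority(updates):
    """Stage 1 (assumed rule): flag the clients in the smaller cluster."""
    ids = list(updates)
    clusters = agglomerative_two_clusters([updates[c] for c in ids])
    if len(clusters) < 2:
        return set()
    return {ids[i] for i in min(clusters, key=len)}

def update_penalties(penalties, flagged, participants, threshold=2):
    """Stage 2 (assumed rule): count flags per client, exclude repeat offenders."""
    for c in participants:
        if c in flagged:
            penalties[c] = penalties.get(c, 0) + 1
    return {c for c in participants if penalties.get(c, 0) >= threshold}

def softmax(logits, temperature=1.0):
    m = max(logits)
    exps = [math.exp((z - m) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_targets(client_logits, trusted, temperature=2.0):
    """Stage 3 (sketch): average trusted clients' logits per synthetic sample,
    then soften them into targets a student model could be distilled on."""
    trusted = sorted(trusted)
    n_samples = len(client_logits[trusted[0]])
    targets = []
    for s in range(n_samples):
        per_class = zip(*(client_logits[c][s] for c in trusted))
        mean_logits = [sum(v) / len(trusted) for v in per_class]
        targets.append(softmax(mean_logits, temperature))
    return targets

# Toy round: three similar updates and one outlier (hypothetical values).
updates = {"c0": [0.10, 0.00], "c1": [0.12, 0.01],
           "c2": [0.09, -0.02], "c3": [2.00, 1.50]}
penalties = {}
flagged = flag_minority(updates)
excluded = update_penalties(penalties, flagged, updates.keys())
trusted = set(updates) - flagged
logits = {"c0": [[2.0, 0.0]], "c1": [[0.0, 2.0]],
          "c2": [[1.0, 1.0]], "c3": [[10.0, -10.0]]}
targets = distillation_targets(logits, trusted)
```

In this toy round the outlier is flagged and left out of the logit average, but it is only excluded outright once its penalty count reaches the threshold in a later round.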