FedDefender: Backdoor Attack Defense in Federated Learning
Waris Gill, Ali Anwar, Muhammad Ali Gulzar
TL;DR
This paper addresses backdoor attacks in Federated Learning by introducing FedDefender, a defense that uses differential testing to fingerprint the neuron activations of client models on randomly generated inputs. At the central server, FedDefender computes a malicious-confidence score for each client and reduces a suspect client's contribution when its score exceeds a threshold, so that aggregation departs from plain FedAvg by excluding or down-weighting malicious updates. Empirically, FedDefender reduces the Attack Success Rate (ASR) to roughly 10% on MNIST and FashionMNIST with 20 and 30 clients while maintaining global accuracy, outperforming both FedAvg and NormClipping. The authors provide a public artifact and argue that the approach is protocol-agnostic and requires no access to raw client data, making it a practical FL defense inspired by software testing principles.
Abstract
Federated Learning (FL) is a privacy-preserving distributed machine learning technique that enables individual clients (e.g., user participants, edge devices, or organizations) to train a model on their local data in a secure environment and then share the trained model with an aggregator to build a global model collaboratively. In this work, we propose FedDefender, a defense mechanism against targeted poisoning attacks in FL that leverages differential testing. Our proposed method fingerprints the neuron activations of clients' models on the same input and uses differential testing to identify a potentially malicious client containing a backdoor. We evaluate FedDefender using the MNIST and FashionMNIST datasets with 20 and 30 clients, and our results demonstrate that FedDefender effectively mitigates such attacks, reducing the attack success rate (ASR) to 10% without deteriorating the global model performance.
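
The idea described above (fingerprint each client's activations on the same input, compare the fingerprints, score the outlier, then reduce its weight at aggregation) can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes each client model is a single hidden ReLU layer stored as a `(W, b)` pair, that a fingerprint is the binary on/off pattern of that layer, that the most deviant client on each random input is the one with the largest total Hamming distance to the others, and that suspects are excluded outright rather than partially down-weighted. The names `fingerprint`, `malicious_confidence`, `aggregate`, and the threshold of 0.5 are hypothetical choices made for illustration.

```python
import numpy as np

def fingerprint(client, x):
    """Binary on/off pattern of one hidden ReLU layer (assumed model form)."""
    W, b = client
    return (x @ W + b > 0).astype(np.float64)

def malicious_confidence(clients, inputs):
    """Fraction of inputs on which each client is the most deviant one."""
    flags = np.zeros(len(clients))
    for x in inputs:
        prints = np.stack([fingerprint(c, x) for c in clients])
        # Pairwise Hamming distances between all client fingerprints.
        dists = np.abs(prints[:, None, :] - prints[None, :, :]).sum(axis=2)
        # Differential test: flag the client that disagrees most with the rest.
        flags[np.argmax(dists.sum(axis=1))] += 1
    return flags / len(inputs)

def aggregate(clients, scores, threshold=0.5):
    """Average client parameters, giving zero weight to suspected clients."""
    weights = np.where(scores > threshold, 0.0, 1.0)
    if weights.sum() == 0:  # fall back to a plain average if all are flagged
        weights[:] = 1.0
    W = np.average(np.stack([c[0] for c in clients]), axis=0, weights=weights)
    b = np.average(np.stack([c[1] for c in clients]), axis=0, weights=weights)
    return W, b

# Toy setup (all values are illustrative): four benign clients close to a
# shared base model, plus one client with heavily perturbed weights.
rng = np.random.default_rng(0)
base_W = rng.normal(size=(16, 32))
base_b = np.zeros(32)
clients = [(base_W + 0.01 * rng.normal(size=base_W.shape), base_b.copy())
           for _ in range(4)]
clients.append((base_W + 1.0 * rng.normal(size=base_W.shape), base_b.copy()))

inputs = rng.normal(size=(50, 16))  # randomly generated test inputs
scores = malicious_confidence(clients, inputs)
global_W, global_b = aggregate(clients, scores)
```

In this toy setup the perturbed client (index 4) receives by far the highest score and is left out of the average. Note that the sketch needs only the client models and random inputs, which is consistent with the claim that no raw client data is required.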
