Defending against Data Poisoning Attacks in Federated Learning via User Elimination

Nick Galanis

TL;DR

This work addresses data poisoning in Federated Learning by proposing a privacy-preserving defense that eliminates adversarial clients during aggregation. The key idea is to have clients report their local training loss with Local Differential Privacy, and to use loss-based anomaly detection, especially a K-Means clustering approach, to identify and ban malicious participants. Across MNIST and CIFAR-10, the defense maintains model utility, preserves privacy, and achieves high attacker-detection metrics, even when up to 40% of clients are malicious. The approach demonstrates a practical balance between privacy and utility, enabling safer deployment of FL in privacy-sensitive domains and inviting future refinements in threat modeling and defense strategies.
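The loss-reporting and clustering idea can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the clipping bound `clip`, the use of the Laplace mechanism for Local Differential Privacy, and the assumption that the higher-loss cluster contains the malicious clients are all choices made here for illustration.

```python
import random


def privatize_loss(loss, epsilon, clip=5.0, rng=random):
    """Client side (assumed design): clip the loss to [0, clip], then add
    Laplace noise with scale clip / epsilon, which gives epsilon-LDP for a
    scalar bounded in that range."""
    clipped = min(max(loss, 0.0), clip)
    scale = clip / epsilon
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return clipped + noise


def two_means_1d(values, iters=50):
    """Plain 2-means on scalars, initialised at the min and max value."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        low = [v for v, lab in zip(values, labels) if lab == 0]
        high = [v for v, lab in zip(values, labels) if lab == 1]
        new_lo = sum(low) / len(low) if low else lo
        new_hi = sum(high) / len(high) if high else hi
        if new_lo == lo and new_hi == hi:
            break
        lo, hi = new_lo, new_hi
    return labels, (lo, hi)


def flag_suspects(reported_losses):
    """Server side (assumed rule): flag the clients in the higher-loss
    cluster. Returns an empty list if no second cluster emerges."""
    labels, _ = two_means_1d(reported_losses)
    if not any(labels):
        return []
    return [i for i, lab in enumerate(labels) if lab == 1]


if __name__ == "__main__":
    rng = random.Random(0)
    true_losses = [0.2, 0.3, 0.25, 2.4, 2.6, 0.28]  # hypothetical values
    reported = [privatize_loss(x, epsilon=20.0, rng=rng) for x in true_losses]
    print(flag_suspects(reported))
```

A smaller `epsilon` gives stronger privacy but noisier reports, so the two clusters overlap more and detection becomes less reliable; this is the privacy-utility trade-off the summary refers to.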

Abstract

In the evolving landscape of Federated Learning (FL), Data Poisoning Attacks, which threaten model integrity by maliciously altering training data, are a growing concern for the research community. This paper introduces a novel defensive framework focused on the strategic elimination of adversarial users within a federated model. We detect these anomalies in the aggregation phase of the federated algorithm by combining metadata gathered from the local training instances with Differential Privacy techniques, which prevents data leakage. To our knowledge, this is the first proposal in the field of FL that leverages metadata other than the model's gradients to ensure honesty in the reported local models. Our extensive experiments demonstrate the efficacy of our methods, significantly mitigating the risk of data poisoning while maintaining user privacy and model performance. Our findings suggest that this user elimination approach strikes a good balance between privacy and utility, thus strengthening the case for the safe adoption of FL in sensitive domains, in both academia and industry.

Paper Structure (41 sections, 1 theorem, 3 equations, 14 figures, 1 table)

Introduction
Problem introduction.
Motivation and contributions.
Preliminaries and Relevant Work
Federated Learning
Differential Privacy
Poisoning Attacks
Defending against Data Poisoning Attacks
Poisoning Attacks against Federated Learning
Metrics used
Experiments results
Impact in standard metrics.
Impact in Source Class Recall.
Novel Algorithm for Defending against Poisoning Attacks
Threat Model
...and 26 more sections

Key Result

THEOREM 1

Differential Privacy, as given in [dwork_algorithmic_2014]. A randomized algorithm $M$ is $(\epsilon, \delta)$-differentially private if, for all $D_1$ and $D_2$ that differ on at most a single element, and for all $S \subseteq Range(M)$, it holds that:

$$\Pr[M(D_1) \in S] \le e^{\epsilon} \, \Pr[M(D_2) \in S] + \delta$$
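As a small numerical illustration of this definition, the sketch below checks the guarantee exhaustively for binary randomized response, a textbook local mechanism that is not taken from the paper. The function names and the choice of mechanism are assumptions made for this example; the two "datasets" are simply the single bits 0 and 1.

```python
import math
from itertools import chain, combinations


def randomized_response_probs(bit, mech_eps):
    """Output distribution of binary randomized response: report the true
    bit with probability e^eps / (1 + e^eps), otherwise report the flip."""
    p_truth = math.exp(mech_eps) / (1.0 + math.exp(mech_eps))
    return {bit: p_truth, 1 - bit: 1.0 - p_truth}


def satisfies_dp(mech_eps, claimed_eps, delta=0.0, tol=1e-12):
    """Check Pr[M(D1) in S] <= e^eps * Pr[M(D2) in S] + delta for every
    subset S of the output range {0, 1}, in both directions."""
    p0 = randomized_response_probs(0, mech_eps)
    p1 = randomized_response_probs(1, mech_eps)
    outputs = [0, 1]
    subsets = chain.from_iterable(
        combinations(outputs, r) for r in range(len(outputs) + 1)
    )
    bound = math.exp(claimed_eps)
    for s in subsets:
        a = sum(p0[o] for o in s)
        b = sum(p1[o] for o in s)
        if a > bound * b + delta + tol or b > bound * a + delta + tol:
            return False
    return True


if __name__ == "__main__":
    print(satisfies_dp(mech_eps=1.0, claimed_eps=1.0))
    print(satisfies_dp(mech_eps=1.0, claimed_eps=0.5))
```

The mechanism meets the bound for any claimed $\epsilon$ at least as large as the one it was tuned for, and fails it for a smaller one, which is what the inequality expresses: $\epsilon$ caps how much any single input can shift the output distribution.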

Figures (14)

Figure 1: Sparse Categorical Accuracy over the different percentages of malicious users present for MNIST (top) and CIFAR (bottom) datasets.
Figure 2: Comparison of the accuracy curve for an honestly and a maliciously trained model.
Figure 3: Cross-entropy Loss over the different percentages of malicious users present for MNIST (top) and CIFAR (bottom) datasets.
Figure 4: Source Class Recall over the different percentages of malicious users present for MNIST (top) and CIFAR (bottom) datasets.
Figure 5: Accuracy over the different percentages of malicious users present for MNIST (top) and CIFAR (bottom) datasets.
...and 9 more figures
