Mitigating Malicious Attacks in Federated Learning via Confidence-aware Defense

Qilei Li, Ahmed M. Abdelmoniem

TL;DR

The paper tackles robustness of Federated Learning to data and model poisoning by identifying malicious clients via confidence scores. It introduces Confidence-Aware Defense (CAD) that estimates per-client confidence, normalizes and clusters these scores to separate honest from malicious updates, and uses re-weighted aggregation to emphasize high-confidence contributors. CAD’s core insight is that attacks increase prediction uncertainty, enabling cross-attack-type detection under varying data heterogeneity (e.g., $\texttt{Dir}(\alpha)$ partitions) and non-IID conditions. Empirical results on CIFAR-10, MNIST, and Fashion-MNIST demonstrate that CAD outperforms baselines in accuracy and stability across attack intensities of 25%, 50%, and 75%, achieving high malicious-client detection rates and benefiting from re-weighted aggregation.

Abstract

Federated Learning (FL) is a distributed machine learning paradigm that enables multiple clients to collaboratively train a global model without sharing their private local data. However, FL systems are vulnerable to attacks launched through malicious clients via data poisoning and model poisoning, which can degrade the performance of the aggregated global model. Existing defense methods typically focus on mitigating specific types of poisoning and are often ineffective against unseen types of attack. These methods also assume that attacks are moderate in intensity, which does not always hold in practice. Consequently, they can fail significantly in terms of accuracy and robustness when detecting and handling updates from malicious clients. To overcome these challenges, we propose a simple yet effective framework to detect malicious clients, namely Confidence-Aware Defense (CAD), which uses the confidence scores of local models as the criterion for evaluating the reliability of local updates. Our key insight is that malicious attacks, regardless of attack type, cause the model to deviate from its previous state, leading to increased uncertainty when making predictions. Therefore, CAD is effective against both model poisoning and data poisoning attacks, accurately identifying and mitigating potentially malicious updates even under varying degrees of attack and data heterogeneity. Experimental results demonstrate that our method significantly enhances the robustness of FL systems against various types of attacks across various scenarios, achieving higher model accuracy and stability.

Paper Structure (26 sections, 13 equations, 5 figures, 3 tables)

This paper contains 26 sections, 13 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Impact of malicious attacks on model convergence under the "Little Is Enough" attack at various intensities. The FedAvg aggregation strategy degrades significantly, while our Confidence-Aware Defense consistently maintains high accuracy and robustness.
  • Figure 2: The proposed Confidence-Aware Defense (CAD) framework. It encompasses five key steps: 1. Initialize the global model and distribute it to clients. 2. Clients train their personalized, confidence-aware local models. 3. Clients upload their local models and associated confidence scores to the server. 4. The server identifies potentially malicious clients through confidence clustering. 5. The server aggregates the local models of honest clients by re-weighted aggregation.
  • Figure 3: Robustness to data heterogeneity by varying degrees of Non-IID data, as controlled by the parameter $\alpha$ in the Dirichlet distribution $\texttt{Dir}(\alpha)$.
  • Figure 4: Malicious client detection accuracy evaluated on the CIFAR-10 dataset with the VGG-11 model.
  • Figure 5: Effect of re-weighting on model accuracy across different degrees of attack.
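
The server-side steps listed for Figure 2 (confidence clustering, then re-weighted aggregation over the clients judged honest) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes each client reports a single scalar confidence score, that scores are min-max normalized, that a simple one-dimensional two-means split stands in for the clustering step, that the higher-confidence cluster is the honest one, and that aggregation weights are proportional to normalized confidence. The names `normalize`, `two_means_1d`, and `aggregate` are hypothetical.

```python
from typing import List, Sequence, Tuple


def normalize(scores: Sequence[float]) -> List[float]:
    """Min-max normalize scores to [0, 1]; identical scores all map to 1.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def two_means_1d(values: Sequence[float], iters: int = 50) -> List[bool]:
    """Split 1-D values into two clusters; True marks the higher-mean cluster."""
    low_c, high_c = min(values), max(values)
    if low_c == high_c:
        return [True] * len(values)  # nothing to separate
    labels = [True] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest centroid (ties go to the high cluster).
        labels = [abs(v - high_c) <= abs(v - low_c) for v in values]
        highs = [v for v, is_high in zip(values, labels) if is_high]
        lows = [v for v, is_high in zip(values, labels) if not is_high]
        new_low, new_high = sum(lows) / len(lows), sum(highs) / len(highs)
        if (new_low, new_high) == (low_c, high_c):
            break  # centroids stopped moving
        low_c, high_c = new_low, new_high
    return labels


def aggregate(
    updates: Sequence[Sequence[float]], confidences: Sequence[float]
) -> Tuple[List[float], List[int]]:
    """Return the confidence-weighted average of kept updates and the kept indices."""
    norm = normalize(confidences)
    honest = two_means_1d(norm)  # assumption: high-confidence cluster is honest
    kept = [i for i, is_honest in enumerate(honest) if is_honest]
    total = sum(norm[i] for i in kept)  # >= 1.0, since the top score normalizes to 1
    weights = [norm[i] / total for i in kept]
    dim = len(updates[0])
    agg = [sum(w * updates[i][d] for w, i in zip(weights, kept)) for d in range(dim)]
    return agg, kept


# Toy round: three consistent clients and one low-confidence outlier.
updates = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [50.0, -50.0]]
confidences = [0.9, 0.85, 0.95, 0.2]
global_update, kept_clients = aggregate(updates, confidences)
```

In this toy round the low-confidence client falls into the lower cluster and is excluded, so the aggregate stays close to the three consistent updates. A real deployment would operate on full model parameter tensors rather than short lists, and the paper's actual confidence estimator and clustering procedure may differ from the stand-ins used here.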