Two Heads Are Better than One: Model-Weight and Latent-Space Analysis for Federated Learning on Non-iid Data against Poisoning Attacks

Xingyu Lyu, Ning Wang, Yang Xiao, Shixiong Li, Tao Li, Danjue Chen, Yimin Chen

TL;DR

This paper addresses the vulnerability of Federated Learning (FL) to model poisoning attacks (MPAs) under non-iid data distributions. It proposes GeminiGuard, a lightweight, unsupervised defense that combines model-weight analysis and latent-space analysis to filter and score client updates before aggregation, using adaptive clustering and trust scoring based on maximum mean discrepancy (MMD). The approach is evaluated across five non-iid settings, four datasets, and multiple untargeted and backdoor attacks, and it shows stronger robustness than nine state-of-the-art defenses, including against adaptive attacks. Ablation studies and a computation-time analysis support its practicality for deployment in realistic non-iid environments. Overall, GeminiGuard offers a versatile, unsupervised framework that mitigates diverse MPAs with modest overhead.
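To make the MMD-based trust scoring mentioned above more concrete, here is a minimal sketch of how a squared MMD between two sets of latent representations can be estimated and turned into per-client weights. This is not the paper's implementation: the RBF kernel, the `gamma` value, the `reference` sample set, and the `1 / (1 + mmd)` scoring rule in `trust_scores` are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    # Clip tiny negative values caused by floating-point error.
    return np.exp(-gamma * np.maximum(sq, 0.0))

def mmd2(x, y, gamma=0.1):
    """Biased estimate of the squared MMD between samples x (n, d) and y (m, d)."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

def trust_scores(client_latents, reference, gamma=0.1):
    # Hypothetical scoring rule (an assumption, not from the paper):
    # a smaller MMD to the reference latents yields a larger normalized weight.
    d = np.array([max(mmd2(z, reference, gamma), 0.0) for z in client_latents])
    s = 1.0 / (1.0 + d)
    return s / s.sum()

# Usage with synthetic latents: one client matches the reference, one is shifted.
rng = np.random.default_rng(0)
reference = rng.normal(size=(64, 8))
similar = rng.normal(size=(64, 8))
shifted = rng.normal(loc=2.0, size=(64, 8))
scores = trust_scores([similar, shifted], reference)
```

In this toy setup the shifted client receives the lower score, which is the behaviour a trust-weighted aggregation rule would rely on.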

Abstract

Federated Learning (FL) is a popular paradigm that enables remote clients to jointly train a global model without sharing their raw data. However, FL has been shown to be vulnerable to model poisoning attacks (MPAs) due to its distributed nature. In particular, attackers acting as participants can upload arbitrary model updates that effectively compromise the global model. While extensive research has focused on defending against these attacks, we find that most existing defenses assume that data at remote clients are iid, whereas in practice they are inevitably non-iid. Our benchmark evaluations reveal that existing defenses generally fail to live up to their reputation when applied to various non-iid scenarios. In this paper, we propose a novel approach, GeminiGuard, that aims to address this significant gap. We design GeminiGuard to be lightweight, versatile, and unsupervised so that it aligns well with the practical requirements of deploying such defenses. The key challenge posed by non-iid data is that it makes benign model updates look more similar to malicious ones. GeminiGuard is built mainly on two fundamental observations: (1) existing defenses based on either model-weight analysis or latent-space analysis alone face limitations in covering different MPAs and non-iid scenarios, and (2) model-weight and latent-space analysis are sufficiently different yet potentially complementary methods as MPA defenses. We hence incorporate a novel model-weight analysis component as well as a custom latent-space analysis component in GeminiGuard, aiming to further enhance its defense performance. We conduct extensive experiments to evaluate our defense across various settings, demonstrating its effectiveness in countering multiple types of untargeted and targeted MPAs, including adaptive ones. Our comprehensive evaluations show that GeminiGuard consistently outperforms state-of-the-art (SOTA) defenses under various settings.

Paper Structure

This paper contains 25 sections, 4 equations, 9 figures, 9 tables, 1 algorithm.

Figures (9)

  • Figure 1: Illustrations of iid and five non-iids (dir-based, prob-based, qty-based, noise, qs) for $u_1, u_2$, and $u_3$, and 10 label classes.
  • Figure 2: Workflow of FL with GeminiGuard.
  • Figure 3: Model updates in t-SNE (BadNets, CIFAR-10) under different scenarios: iid, non-iid (qty), and with GeminiGuard (GG).
  • Figure 4: Distance of layers between Benign-to-Benign (b2b), Benign-to-Malicious (b2m), and Malicious-to-Malicious (m2m).
  • Figure 5: Comparison of non-iid impacts on various FL defenses for CIFAR-10 under the Min-Max attack (Shejwalkar and Houmansadr, 2021).
  • ...and 4 more figures
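
As an illustration of the "dir-based" non-iid setting named in Figure 1, the sketch below partitions a labeled dataset across clients with a Dirichlet distribution over each class. It assumes that "dir-based" refers to the Dirichlet label-skew partition commonly used in FL benchmarks; the function name `dirichlet_partition` and the parameters `alpha` and `seed` are hypothetical and not taken from the paper.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with per-class Dirichlet proportions.

    Smaller alpha gives more skewed (more non-iid) label distributions.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Share of this class that each client receives.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Cut points that split this class's samples according to those shares.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices

# Usage: 10 label classes with 100 samples each, split across 3 clients.
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(labels, num_clients=3, alpha=0.5)
```

Every sample is assigned to exactly one client, and lowering `alpha` concentrates each class on fewer clients.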