Mingling with the Good to Backdoor Federated Learning

Nuno Neves

TL;DR

This paper examines the vulnerability of federated learning (FL) to backdoor attacks by introducing MIGO, a generic strategy for crafting malicious updates that blend with benign updates and gradually implant a backdoor in the global model. MIGO is evaluated with three backdoor types (IN, EDGE, OUT) across five datasets and multiple architectures, and relies on adaptive ESR/MPR constraints and layer-forcing to evade defenses. Empirically, MIGO achieves backdoor accuracy exceeding 90% while preserving main-task utility, evades ten defenses, and outperforms four alternative attack strategies in most configurations. Even an attacker controlling as little as 0.1% of clients can insert a backdoor if it persists for enough rounds. The findings highlight a substantial threat to FL deployments and underscore the need for defenses that can robustly distinguish subtly manipulated updates and protect the global model from persistent backdoors.

Abstract

Federated learning (FL) is a decentralized machine learning technique that allows multiple entities to jointly train a model while preserving dataset privacy. However, its distributed nature has raised various security concerns, which have been addressed by increasingly sophisticated defenses. These protections utilize a range of data sources and metrics to, for example, filter out malicious model updates, ensuring that the impact of attacks is minimized or eliminated. This paper explores the feasibility of designing a generic attack method capable of installing backdoors in FL while evading a diverse array of defenses. Specifically, we focus on an attacker strategy called MIGO, which aims to produce model updates that subtly blend with legitimate ones. The resulting effect is a gradual integration of a backdoor into the global model, often ensuring its persistence long after the attack concludes, while generating enough ambiguity to hinder the effectiveness of defenses. MIGO was employed to implant three types of backdoors across five datasets and different model architectures. The results demonstrate the significant threat posed by these backdoors, as MIGO consistently achieved exceptionally high backdoor accuracy (exceeding 90%) while maintaining the utility of the main task. Moreover, MIGO exhibited strong evasion capabilities against ten defenses, including several state-of-the-art methods. When compared to four other attack strategies, MIGO consistently outperformed them across most configurations. Notably, even in extreme scenarios where the attacker controls just 0.1% of the clients, the results indicate that successful backdoor insertion is possible if the attacker can persist for a sufficient number of rounds.

Paper Structure

This paper contains 20 sections, 1 equation, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: The distance (L2 norm) between the global model at each round $r$ and the global model at an $init$ round. [CIFAR10 dataset; $init$ = round 1800; 1 persistent attacker]
  • Figure 2: Representation on a 2-dimensional parameter space of a round of training with the MIGO attack strategy ($M$ is a malicious local model being trained with a few batches, and $B_i$ are the benign local models at the end of a round of training).
  • Figure 3: Global model accuracy with the (a) backdoor test dataset and (b) benign test dataset; (c) MIGO estimate of the distance between benign local models and the global model $G_r$, and the real observed max/min distances. [CIFAR10; 1 persistent attacker for 200 rounds; IN-backdoor]
  • Figure 4: (a) Main task accuracy for the EDGE backdoor while training with the DIGIT dataset; (b) Three example OUT-backdoors with CIFAR10; (c) An attack lasting 1000 rounds, with a Random adversary controlling 0.1% of the participants (1 out of 1000) with CIFAR10 and IN/EDGE/OUT backdoors.
  • Figure 5: (a) Cross-silo FL with 20 clients, where 4 are persistent attackers (EDGE/OUT backdoors with the CIFAR10 dataset). (b) and (c) Number of malicious and benign client updates accepted by Krum and FreqFed over groups of 10 rounds (IN-backdoor during the first 200 rounds of the attack).
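
The distance metric named in the Figure 1 caption (the L2 norm between the global model at round $r$ and the global model at an $init$ round) can be illustrated with a short sketch. This is not the authors' code: the `Params` representation (layer name mapped to a flat list of weights) and the helper names `l2_distance` and `drift_from_init` are assumptions made for illustration, and the snapshots are toy values rather than results from the paper.

```python
import math
from typing import Dict, List

# Hypothetical representation: layer name -> flattened list of weights.
Params = Dict[str, List[float]]


def l2_distance(model_a: Params, model_b: Params) -> float:
    """Euclidean (L2) distance between two models with identical layers."""
    if model_a.keys() != model_b.keys():
        raise ValueError("models must have the same layer names")
    total = 0.0
    for name, weights_a in model_a.items():
        weights_b = model_b[name]
        if len(weights_a) != len(weights_b):
            raise ValueError(f"layer {name!r} has mismatched sizes")
        # Accumulate squared differences over every parameter of the layer.
        total += sum((a - b) ** 2 for a, b in zip(weights_a, weights_b))
    return math.sqrt(total)


def drift_from_init(snapshots: Dict[int, Params], init_round: int) -> Dict[int, float]:
    """Distance of each saved global model from the snapshot at init_round."""
    reference = snapshots[init_round]
    return {r: l2_distance(params, reference) for r, params in sorted(snapshots.items())}


# Toy global-model snapshots (illustrative values only).
snapshots = {
    0: {"w": [0.0, 0.0], "b": [0.0]},
    1: {"w": [3.0, 4.0], "b": [0.0]},
    2: {"w": [3.0, 4.0], "b": [12.0]},
}
drift = drift_from_init(snapshots, init_round=0)
print(drift)
```

Plotting `drift` against the round number would give a curve of the same kind as the one the caption describes, with the reference round at distance zero.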