A Client-level Assessment of Collaborative Backdoor Poisoning in Non-IID Federated Learning

Phung Lai, Guanxiong Liu, NhatHai Phan, Issa Khalil, Abdallah Khreishah, Xintao Wu

TL;DR

This work investigates backdoor vulnerabilities in federated learning (FL) when client data are non-IID and shows that existing defenses often fail to protect individual clients. It introduces CollaPois, a collaborative backdoor attack that trains a Trojaned model $X$ on auxiliary data and has compromised clients submit coordinated malicious gradients $\Delta \theta_c^t = \psi_c^t [X-\theta^t]$, with $\psi_c^t$ drawn from $\mathrm{Uniform}[a,b]$, to steer the global model $\theta^t$ toward $X$ while preserving accuracy on legitimate samples. The authors establish a theoretical link between data diversity, stealth, and the minimum number of compromised clients via $|C| \ge \frac{2-\sigma^2-\mu_\alpha^2}{a+b+2-\sigma^2-\mu_\alpha^2}\,|N|$, showing that more diverse local data reduces the number of compromised clients needed and increases backdoor success. Empirically, CollaPois outperforms baseline attacks on Sentiment and FEMNIST, remains effective under robust federated training, and can backdoor a meaningful fraction of benign clients even when as few as 0.5–1% of participants are compromised. The findings underscore the importance of client-level risk assessment and highlight weaknesses in current defenses against collaborative poisoning in non-IID FL, motivating new protective strategies.
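The update rule in the TL;DR can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the server adds the plain average of the submitted updates to $\theta^t$, it simulates compromised clients only (no benign clients, no defense), and the names `malicious_update`, `theta`, `trojan` and the values of `a` and `b` are hypothetical.

```python
import random


def malicious_update(theta, trojan, a, b, rng):
    """Return psi * (X - theta), with psi drawn from Uniform[a, b]."""
    psi = rng.uniform(a, b)
    return [psi * (x - t) for x, t in zip(trojan, theta)]


rng = random.Random(0)
theta = [0.0, 0.0, 0.0]      # toy global model (hypothetical values)
trojan = [1.0, -2.0, 0.5]    # toy Trojaned model X (hypothetical values)

for _ in range(50):
    # Four compromised clients each scale the same direction X - theta.
    updates = [malicious_update(theta, trojan, 0.1, 0.3, rng) for _ in range(4)]
    # Assumed aggregation: coordinate-wise mean, added to the global model.
    mean_update = [sum(col) / len(updates) for col in zip(*updates)]
    theta = [t + u for t, u in zip(theta, mean_update)]

# Largest remaining coordinate-wise gap between the global model and X.
distance = max(abs(x - t) for x, t in zip(trojan, theta))
```

Because every update points from $\theta^t$ toward $X$, each round in this toy setting shrinks the gap by a factor of one minus the average $\psi_c^t$. The random scaling only varies the step length across clients, not the direction.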

Abstract

Federated learning (FL) enables collaborative model training using decentralized private data from multiple clients. While FL has shown robustness against poisoning attacks with basic defenses, our research reveals new vulnerabilities stemming from non-independent and identically distributed (non-IID) data among clients. These vulnerabilities pose a substantial risk of model poisoning in real-world FL scenarios. To demonstrate such vulnerabilities, we develop a novel collaborative backdoor poisoning attack called CollaPois. In this attack, we distribute a single pre-trained model infected with a Trojan to a group of compromised clients. These clients then work together to produce malicious gradients, causing the FL model to consistently converge towards a low-loss region centered around the Trojan-infected model. Consequently, the impact of the Trojan is amplified, especially when the benign clients have diverse local data distributions and scattered local gradients. Unlike existing attacks, CollaPois achieves its goals with only a limited number of compromised clients. It also avoids noticeable shifts or degradation in the FL model's performance on legitimate data samples, allowing it to operate stealthily and evade detection by advanced robust FL algorithms. Theoretical analysis and experiments on several benchmark datasets show that CollaPois outperforms state-of-the-art backdoor attacks. In particular, CollaPois bypasses existing backdoor defenses, especially in scenarios where clients possess diverse data distributions, and it remains effective even with a small number of compromised clients. Finally, clients whose local data are closely aligned with those of compromised clients face higher risks of backdoor infection.

Paper Structure

This paper contains 21 sections, 3 theorems, 22 equations, 25 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

The minimum number of compromised clients needed to carry out backdoor poisoning successfully in the worst-case scenario is given by $|C| \ge \frac{2-\sigma^2-\mu_\alpha^2}{a+b+2-\sigma^2-\mu_\alpha^2}\,|N|$, where $\beta_i$ is the angle between the gradient of an arbitrary benign client $i$ and the aggregated malicious gradient of all the compromised clients. We assume that $\beta_i$ follows a normal distribution, i.e., $\beta_i \sim \ma
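As a worked illustration, the bound quoted in the TL;DR, $|C| \ge \frac{2-\sigma^2-\mu_\alpha^2}{a+b+2-\sigma^2-\mu_\alpha^2}\,|N|$, can be evaluated numerically. The sketch below assumes that $\mu_\alpha$ and $\sigma$ are the mean and standard deviation of the angle distribution and that $a$, $b$ are the endpoints of the Uniform range for $\psi_c^t$. The function names and the numeric inputs are hypothetical, and no guard is included for inputs outside the regime where the bound is meaningful.

```python
import math


def min_compromised_fraction(a, b, mu_alpha, sigma):
    """Evaluate (2 - sigma^2 - mu^2) / (a + b + 2 - sigma^2 - mu^2)."""
    spread = 2.0 - sigma ** 2 - mu_alpha ** 2
    return spread / (a + b + spread)


def min_compromised_clients(n_clients, a, b, mu_alpha, sigma):
    """Smallest integer |C| satisfying the bound for |N| = n_clients."""
    return math.ceil(min_compromised_fraction(a, b, mu_alpha, sigma) * n_clients)


# Hypothetical inputs, chosen only to exercise the formula.
needed = min_compromised_clients(1000, a=0.9, b=1.0, mu_alpha=1.2, sigma=0.5)

# A larger sigma (more scattered benign gradients) lowers the required fraction.
low_diversity = min_compromised_fraction(0.9, 1.0, 1.2, 0.5)
high_diversity = min_compromised_fraction(0.9, 1.0, 1.2, 0.7)
```

With these inputs the numerator is $2 - 0.25 - 1.44 = 0.31$ and the fraction is $0.31 / 2.21 \approx 0.14$, so `needed` is 141 out of 1000 clients. Raising $\sigma$ to 0.7 shrinks the numerator to 0.07 and the fraction to about 0.036, matching the claim that more diverse local data reduces the number of compromised clients needed.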

Figures (25)

  • Figure 1: DPois and MRepl attacks show modest changes, with $0.1\%$ and $1\%$ compromised clients across distribution levels.
  • Figure 2: CollaPois framework. In each training round, compromised clients send malicious gradients (red-solid) to steer the FL model $\theta$ toward a Trojaned model $X$ sent by the attacker (red-dashed). Dashed and solid vectors indicate one-time and multiple training rounds, respectively.
  • Figure 3: Average angles among gradients from benign and compromised clients as a function of $\alpha$ in the FEMNIST dataset. Model and data configuration are given in the Experimental Results section.
  • Figure 4: Approximation error for the lower bound of $|\mathcal{C}|$ in Theorem 1 as a function of $\alpha$ using the FEMNIST dataset.
  • Figure 5: 3D plot of $|\mathcal{C}| / |N|$ as a function of $\mu_\alpha$ and $\sigma$.
  • ...and 20 more figures
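
The angle statistics referred to in the Figure 3 caption and in Theorem 1 can be sketched as follows. This is an illustrative computation only: it assumes gradients are flattened into plain lists of floats, that the malicious gradients are aggregated by summation, and that angles are measured in radians. The helper names and the toy vectors are hypothetical and do not come from the paper.

```python
import math
import statistics


def angle_between(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    # Clamp to [-1, 1] to absorb floating-point rounding before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def aggregate(gradients):
    """Coordinate-wise sum of a list of gradient vectors (assumed aggregation)."""
    return [sum(col) for col in zip(*gradients)]


# Toy gradients (hypothetical values).
benign = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
malicious = [[1.0, 1.0], [1.0, 1.0]]

agg = aggregate(malicious)
angles = [angle_between(g, agg) for g in benign]
mu = statistics.mean(angles)       # empirical mean angle
sigma = statistics.pstdev(angles)  # empirical (population) standard deviation
```

Estimates such as `mu` and `sigma` are the kind of quantities that the bound on the number of compromised clients takes as input.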

Theorems & Definitions (6)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • proof
  • proof
  • proof