Infighting in the Dark: Multi-Label Backdoor Attack in Federated Learning
Ye Li, Yanchao Zhao, Chengcheng Zhu, Jiale Zhang
TL;DR
The paper studies backdoor vulnerabilities in Federated Learning (FL) under the Multi-Label Backdoor Attack (MBA) scenario, in which non-cooperative adversaries pursue distinct targets independently. It shows that conventional attacks, designed for the Single-Label Backdoor Attack (SBA) setting, suffer from infighting in this scenario: attackers build similar out-of-distribution backdoor mappings for different targets, so their backdoors conflict and cannot coexist. To overcome this, the authors propose Mirage, which constructs in-distribution (ID) mappings between backdoor features and target distributions through adversarial trigger adaptation, and uses constrained optimization so that the mappings persist through global training. Empirical results show that Mirage achieves an average attack success rate (ASR) above 97% and sustains over 90% after 900 rounds across multiple datasets and models, outperforming state-of-the-art attacks and bypassing existing defenses. The work highlights a practical threat in large-scale FL and motivates the development of MBA-aware defenses; the code is open-source for reproducibility.
Abstract
Federated Learning (FL), a privacy-preserving decentralized machine learning framework, has been shown to be vulnerable to backdoor attacks. Current research primarily focuses on the Single-Label Backdoor Attack (SBA), wherein adversaries share a consistent target. However, a critical fact is overlooked: adversaries may be non-cooperative, have distinct targets, and operate independently, which constitutes a more practical scenario called the Multi-Label Backdoor Attack (MBA). Unfortunately, prior attacks are ineffective in the MBA scenario because non-cooperative attackers exclude each other. In this work, we conduct an in-depth investigation to uncover the inherent cause of this exclusion: similar backdoor mappings are constructed for different targets, resulting in conflicts among backdoor functions. To address this limitation, we propose Mirage, the first non-cooperative MBA strategy in FL that allows attackers to inject effective and persistent backdoors into the global model without collusion by constructing in-distribution (ID) backdoor mappings. Specifically, we introduce an adversarial adaptation method to bridge the backdoor features and the target distribution in an ID manner. We further leverage a constrained optimization method to ensure that the ID mapping survives the global training dynamics. Extensive evaluations demonstrate that Mirage outperforms various state-of-the-art attacks and bypasses existing defenses, achieving an average attack success rate (ASR) greater than 97% and maintaining over 90% after 900 rounds. This work aims to alert researchers to this potential threat and inspire the design of effective defense mechanisms. Code has been made open-source.
