Foundation Models in Federated Learning: Assessing Backdoor Vulnerabilities

Xi Li, Chen Wu, Jiaqi Wang

TL;DR

This work investigates backdoor vulnerabilities that arise when foundation models are integrated into federated learning (FM-FL). It introduces two novel attack vectors: external, one-time poisoning of the foundation model via in-context learning, and backdoors that then propagate through the FM-FL interaction (prototype initialization and ensemble distillation), causing triggered inputs to be misclassified without persistent attacker involvement. Extensive experiments on image (CIFAR-10/100) and text (AG-NEWS) datasets demonstrate that FM-FL is highly susceptible to this backdoor attack (BD-FMFL), with attack success rates approaching or exceeding 80–90% in many settings, while existing FL defenses offer limited protection. The findings stress the urgent need for security mechanisms tailored to FM-FL to guard against server- and FM-originated backdoor threats in large-scale, heterogeneous FL deployments.

Abstract

Federated Learning (FL), a privacy-preserving machine learning framework, faces significant data-related challenges. For example, the lack of suitable public datasets leads to ineffective information exchange, especially in heterogeneous environments with uneven data distributions. Foundation Models (FMs) offer a promising solution by generating synthetic datasets that mimic client data distributions, aiding model initialization and knowledge sharing among clients. However, the interaction between FMs and FL introduces new attack vectors that remain largely unexplored. This work therefore assesses backdoor vulnerabilities that originate in FMs, where attackers exploit safety issues in FMs and poison the synthetic datasets to compromise the entire system. Unlike traditional attacks, these new threats are one-time and external, requiring minimal involvement in FL training. Given these unique characteristics, current FL defense strategies provide limited robustness against this novel attack approach. Extensive experiments across image and text domains reveal the high susceptibility of FL to these novel threats, emphasizing the urgent need for enhanced security measures in FL in the era of FMs.
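
The susceptibility reported here is usually quantified with an attack success rate (ASR). The following is a minimal sketch of the common definition, assuming that ASR is the fraction of triggered test inputs from non-target classes that the model assigns to the attacker's target class; the paper's exact evaluation protocol may differ. The function name `attack_success_rate` and the toy labels are hypothetical and serve only as illustration.

```python
import numpy as np

def attack_success_rate(pred_on_triggered, true_labels, target_label):
    """Fraction of triggered, non-target-class inputs predicted as target_label.

    Assumed (common) definition; not taken from the paper's code.
    """
    pred = np.asarray(pred_on_triggered)
    true = np.asarray(true_labels)
    # Samples whose true class already equals the target are excluded,
    # since predicting the target for them is not evidence of a backdoor.
    mask = true != target_label
    if not mask.any():
        raise ValueError("no non-target samples to evaluate")
    return float(np.mean(pred[mask] == target_label))

# Toy example: model predictions on five triggered inputs (hypothetical data).
preds = [0, 0, 3, 0, 0]
labels = [1, 2, 3, 0, 4]
asr = attack_success_rate(preds, labels, target_label=0)
print(f"ASR = {asr:.2f}")
```

In the toy example, four inputs come from non-target classes and three of them are pushed to the target class, so the sketch reports an ASR of 0.75.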

Paper Structure

This paper contains 27 sections, 4 equations, 3 figures, 5 tables, and 1 algorithm.

Figures (3)

  • Figure 1: The novel backdoor attack strategy targeting FM-FL. Red arrows indicate steps affected by the compromised FM.
  • Figure 2: Ablation study in cross-silo FL using the IID CIFAR-10. AS-1/3: Utilizes poisoned synthetic data exclusively in ensemble distillation. AS-2/4: Utilizes poisoned synthetic data exclusively in model initialization.
  • Figure 3: Hyper-parameter study in cross-silo homo-FL scenarios. (a)(b) use the IID CIFAR-10 dataset, (c) uses the non-IID CIFAR-100 dataset, (d)(e) use the IID CIFAR-10 dataset. LDI refers to the ratio between the number of iterations of (client) local training and that of (server) knowledge distillation.
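
The captions refer to ensemble distillation on synthetic data without spelling out the step. As a rough illustration, the sketch below shows one generic form of server-side ensemble distillation: client logits on a shared (here, synthetic) batch are averaged into soft labels, and the global model is scored against them with a KL divergence. This is an assumption about the general technique, not the paper's exact procedure; all function names, the temperature parameter, and the toy logits are hypothetical. Because the soft labels are computed on the synthetic batch, any poisoning of that batch or of the client models enters the distillation targets directly.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_soft_labels(client_logits, temperature=1.0):
    """Average client logits of shape (clients, samples, classes) into soft labels."""
    mean_logits = np.mean(np.asarray(client_logits, dtype=float), axis=0)
    return softmax(mean_logits, temperature)

def distillation_loss(teacher_probs, student_logits, temperature=1.0, eps=1e-12):
    """Mean KL(teacher || student) over the batch."""
    student_probs = softmax(student_logits, temperature)
    kl = np.sum(
        teacher_probs * (np.log(teacher_probs + eps) - np.log(student_probs + eps)),
        axis=-1,
    )
    return float(np.mean(kl))

# Two hypothetical clients, two synthetic samples, three classes.
client_logits = np.array([
    [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 2.0], [0.0, 3.0, 0.0]],
])
teacher = ensemble_soft_labels(client_logits)

# A student that already matches the averaged logits incurs zero loss;
# an uninformed student (all-zero logits) incurs a positive loss.
loss_matched = distillation_loss(teacher, client_logits.mean(axis=0))
loss_uniform = distillation_loss(teacher, np.zeros((2, 3)))
print(loss_matched, loss_uniform)
```

In a full training loop the server would minimize this loss with gradient steps on the global model, and the LDI ratio from Figure 3 would determine how many client local-training iterations run per server distillation iteration.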