FedPrompt: Communication-Efficient and Privacy Preserving Prompt Tuning in Federated Learning

Haodong Zhao, Wei Du, Fangqi Li, Peixuan Li, Gongshen Liu

TL;DR

FedPrompt integrates prompt tuning into federated learning by freezing the large PLM and training only soft prompts under a split-aggregation framework. Clients exchange about 0.01% of the PLM's parameters, with minimal accuracy loss on both IID and Non-IID data. The paper also studies robustness to backdoor attacks mounted by poisoning local data, and finds that aggregating prompt updates across clients mitigates the backdoor effect. Additional results show that local differential privacy can be added at a modest accuracy cost, and give guidance on hyperparameters (local iterations, prompt size, and choice of prompt method). The approach makes FL for NLP more practical, scalable, and privacy-conscious, while highlighting security considerations and avenues for further strengthening robustness.

Abstract

Federated learning (FL) has enabled global model training on decentralized data in a privacy-preserving way by aggregating model updates. However, for many natural language processing (NLP) tasks that use pre-trained language models (PLMs) with large numbers of parameters, FL incurs considerable communication costs. Recently, prompt tuning, which tunes a small set of soft prompts without modifying the PLM, has achieved excellent performance as a new learning paradigm. We therefore combine the two methods and explore the effect of prompt tuning under FL. In this paper, we propose "FedPrompt" to study prompt tuning with model split aggregation in FL, and show that split aggregation greatly reduces the communication cost, to only 0.01% of the PLM's parameters, with little decrease in accuracy on both IID and Non-IID data distributions. This improves the efficiency of FL while also protecting data privacy in prompt tuning. In addition, like PLMs, prompts are uploaded and downloaded between public platforms and personal users, so we investigate whether a backdoor threat remains when only soft prompts are used in FL scenarios. We further conduct backdoor attacks by data poisoning on FedPrompt. Our experiments show that a conventional backdoor attack cannot achieve a high attack success rate, demonstrating the robustness of FedPrompt. We hope this work can promote the application of prompts in FL and raise awareness of the possible security threats.
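The split aggregation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `aggregate_prompts` and `communication_ratio`, the FedAvg-style weighting by local dataset size, the prompt length of 20, and the figure of roughly 110 million parameters for a BERT-base-sized PLM are all assumptions made for illustration. Only the soft prompt matrices are averaged; the frozen PLM weights never leave the clients.

```python
import numpy as np


def aggregate_prompts(client_prompts, client_sizes):
    """Weighted average of client soft prompts (FedAvg-style, assumed here).

    client_prompts: list of arrays, each of shape (prompt_len, hidden_dim)
    client_sizes:   number of local training examples per client
    """
    weights = np.asarray(client_sizes, dtype=np.float64)
    weights = weights / weights.sum()
    stacked = np.stack(client_prompts)  # (num_clients, prompt_len, hidden_dim)
    # Contract the client axis against the weights.
    return np.tensordot(weights, stacked, axes=1)


def communication_ratio(prompt_len, hidden_dim, plm_params):
    """Fraction of the PLM's parameter count that is sent per round."""
    return prompt_len * hidden_dim / plm_params


# Hypothetical setup: 3 clients, 20 prompt tokens, hidden size 768.
rng = np.random.default_rng(0)
prompt_len, hidden_dim = 20, 768
client_prompts = [rng.normal(size=(prompt_len, hidden_dim)) for _ in range(3)]
client_sizes = [100, 300, 600]

# Server step: aggregate only the soft prompts, then redistribute the result.
global_prompt = aggregate_prompts(client_prompts, client_sizes)

# Assumed PLM size of about 110 million parameters (BERT-base scale).
ratio = communication_ratio(prompt_len, hidden_dim, 110_000_000)
print(f"Prompt parameters as a share of the PLM: {ratio:.4%}")
```

Under these assumed sizes the prompt holds 15,360 parameters, roughly 0.014% of the PLM, which is the same order of magnitude as the 0.01% reported in the abstract. The exact share depends on the prompt length and the prompt method used.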

Paper Structure (20 sections, 8 equations, 6 figures, 6 tables, 1 algorithm)

Figures (6)

  • Figure 1: An example of prompt tuning, which consists of a soft prompt, the input text, a PLM, and a verbalizer.
  • Figure 2: Structure of FedPrompt and of full PLM fine-tuning using FL. Top: full PLM fine-tuning using FL, where all parameters (framed pink nodes) need to be updated. Bottom: FedPrompt, where only the soft prompt parameters (framed pink nodes) need to be updated, aggregated (on the server), and distributed.
  • Figure 3: Performance of prompt tuning without FL and of FedPrompt. With FL, the data distribution follows either the IID or the Non-IID setting. The PLMs used are BERT (left), RoBERTa (middle), and T5 (right).
  • Figure 4: Local and global ACC (%) over communication rounds on the SST-2 task using BERT. Left: IID setting; right: Non-IID setting. The two clients shown are selected randomly.
  • Figure 5: Local and global ACC (%) and ASR (%) over communication rounds on the SST-2 task using BERT. The results use FedPPT in the IID setting, with one fixed client out of ten being malicious. The benign client shown is selected randomly.
  • ...and 1 more figures