SFPrompt: Communication-Efficient Split Federated Fine-Tuning for Large Pre-Trained Models over Resource-Limited Devices

Linxiao Cao, Yifei Zhu, Wei Gong

TL;DR

SFPrompt addresses privacy-preserving fine-tuning of large pre-trained models on resource-limited devices by combining split learning with federated learning. It partitions the model into a head $W_h$, a body $W_b$, and a tail $W_t$, placing $W_C=[W_h,W_t]$ on the clients and $W_S=W_b$ on the server, and adds learnable prompts $p$ together with a three-phase training process that includes local-loss updates and dataset pruning. The key contributions are a dynamic client-server partitioning strategy, prompt-based fine-tuning, and an EL2N-driven data pruning mechanism, which together reduce communication and computation while preserving privacy. Empirical results on ViT-based vision tasks show that SFPrompt achieves competitive accuracy with substantially lower local FLOPs (about 0.46% of FL) and much less communication (roughly 0.47× for ViT-Base and 0.19× for ViT-Large) than full federated fine-tuning and other baselines, which makes it practical for privacy-constrained, resource-limited deployments.
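
The head/body/tail placement described above can be illustrated with a minimal sketch. The block names, parameter counts, and the `split_model` helper are hypothetical and chosen only for illustration; they are not the paper's configuration or measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One layer group of a model, with an illustrative parameter count."""
    name: str
    n_params: int


def split_model(blocks, n_head, n_tail):
    """Partition an ordered list of blocks into (head, body, tail)."""
    if n_head < 0 or n_tail < 0 or n_head + n_tail > len(blocks):
        raise ValueError("head and tail must fit inside the model")
    cut = len(blocks) - n_tail
    return blocks[:n_head], blocks[n_head:cut], blocks[cut:]


def param_count(blocks):
    """Total number of parameters in a list of blocks."""
    return sum(b.n_params for b in blocks)


# Hypothetical ViT-like stack; the sizes are made up for the example.
blocks = (
    [Block("patch_embed", 600_000)]
    + [Block(f"encoder_{i}", 7_000_000) for i in range(12)]
    + [Block("classifier", 80_000)]
)

# W_C = [W_h, W_t] stays on the client, W_S = W_b goes to the server.
head, body, tail = split_model(blocks, n_head=1, n_tail=1)
client_params = param_count(head) + param_count(tail)
server_params = param_count(body)
client_share = client_params / (client_params + server_params)
print(f"client holds {client_share:.2%} of the parameters")
```

With a thin head and tail, the client holds only a small fraction of the parameters, which is the intuition behind the reduced local computation; the actual savings depend on where the cut points are placed.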

Abstract

Large pre-trained models have exhibited remarkable achievements across various domains. The substantial training costs associated with these models have led to extensive study of fine-tuning as a way to harness their capabilities for downstream tasks. Yet conventional fine-tuning approaches become infeasible when the model cannot access downstream data because of privacy concerns. Naively integrating fine-tuning approaches with emerging federated learning frameworks incurs substantial communication overhead and places high demands on local computing resources, making it impractical for common resource-limited devices. In this paper, we introduce SFPrompt, a privacy-preserving fine-tuning method tailored to the federated setting in which direct uploading of raw data is prohibited and local devices are too resource-constrained to run a complete pre-trained model. In essence, SFPrompt combines split learning with federated learning to handle these challenges. Specifically, the pre-trained model is first partitioned into client and server components, which streamlines the client-side model and substantially reduces the computational demands on local resources. SFPrompt then introduces soft prompts into the federated model to enhance fine-tuning performance. To further reduce communication costs, a novel dataset pruning algorithm and a local-loss update strategy are devised for the fine-tuning process. Extensive experiments demonstrate that SFPrompt delivers performance competitive with the federated full fine-tuning approach while consuming a mere 0.46% of local computing resources and incurring 53% less communication cost.
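
As a rough illustration of EL2N-based dataset pruning, the sketch below scores each sample by the L2 norm of the difference between the softmax output and the one-hot label, then keeps the highest-scoring samples. The function names, the `keep_fraction` parameter, and the choice to keep the highest-scoring samples are assumptions made for this example; the text above does not specify how SFPrompt selects samples.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def el2n_score(logits, label):
    """EL2N score: L2 norm of (softmax(logits) - one_hot(label))."""
    probs = softmax(logits)
    return math.sqrt(
        sum((p - (1.0 if k == label else 0.0)) ** 2 for k, p in enumerate(probs))
    )


def prune_by_el2n(logits_per_sample, labels, keep_fraction):
    """Return (kept sample indices, all scores).

    Assumption: the samples with the highest scores are kept.
    """
    scores = [el2n_score(l, y) for l, y in zip(logits_per_sample, labels)]
    n_keep = max(1, int(keep_fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep]), scores


# Toy logits for four samples that all carry label 0.
logits = [
    [4.0, 0.0, 0.0],  # confident and correct: low score
    [0.0, 0.0, 0.0],  # uniform prediction
    [0.0, 4.0, 0.0],  # confident and wrong: high score
    [1.0, 0.5, 0.0],  # mildly correct
]
labels = [0, 0, 0, 0]
kept, scores = prune_by_el2n(logits, labels, keep_fraction=0.5)
print(kept)
```

In this toy setup the confidently correct sample gets the lowest score and is pruned first, so fewer samples have to pass through the client-server exchange in each round.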

Paper Structure

This paper contains 15 sections, 3 equations, 7 figures, 3 tables, 2 algorithms.

Figures (7)

  • Figure 1: Limitations of current fine-tuning approaches in privacy-preserving and resource-limited environments.
  • Figure 2: (a) Communication cost comparison in a global round between FL and SFL; (b) the relationship between the communication rounds and communication cost.
  • Figure 3: Overview of SFPrompt, a three-phase process. Phase 1 minimizes communication cost through local-loss updates and dataset pruning; Phase 2 splits the foundation model to reduce the local computational burden, keeping only the lightweight part on the client; Phase 3 aggregates the tail model and prompt parameters to refresh the global model.
  • Figure 4: Comparison of the accuracy among three methods.
  • Figure 5: The accuracy of SFPrompt with various prompt lengths and tuned parameters on CIFAR-100.
  • ...and 2 more figures
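
The aggregation step named in the Figure 3 caption (Phase 3) can be sketched as a sample-weighted average of each client's tail and prompt parameters. The weighting rule is an assumption borrowed from standard federated averaging; the caption does not state which rule SFPrompt uses, and the `weighted_average` helper and parameter names are hypothetical.

```python
def weighted_average(client_updates):
    """Average parameter vectors across clients, weighted by sample count.

    client_updates: list of (n_samples, {param_name: [float, ...]}).
    Assumption: FedAvg-style weighting by local dataset size.
    """
    total = sum(n for n, _ in client_updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    reference = client_updates[0][1]
    aggregated = {}
    for name, values in reference.items():
        aggregated[name] = [
            sum(n * params[name][j] for n, params in client_updates) / total
            for j in range(len(values))
        ]
    return aggregated


# Two hypothetical clients that each send only tail and prompt parameters.
updates = [
    (10, {"tail": [1.0, 2.0], "prompt": [0.0]}),
    (30, {"tail": [3.0, 6.0], "prompt": [4.0]}),
]
global_params = weighted_average(updates)
print(global_params)
```

Only the tail and prompt entries appear in the update, so the payload in this step stays small compared with sending a full model.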