SFPrompt: Communication-Efficient Split Federated Fine-Tuning for Large Pre-Trained Models over Resource-Limited Devices
Linxiao Cao, Yifei Zhu, Wei Gong
TL;DR
SFPrompt tackles the problem of privately fine-tuning large pre-trained models on resource-limited devices by combining split learning with federated learning. It partitions the model into a head $W_h$, a body $W_b$, and a tail $W_t$, placing $W_C=[W_h,W_t]$ on the clients and $W_S=W_b$ on the server, and introduces learnable prompts $p$ alongside a three-phase training process that includes local-loss updates and dataset pruning. The key contributions are a dynamic client-server partitioning strategy, prompt-based fine-tuning, and an EL2N-driven data pruning mechanism, which together reduce communication and computation while preserving privacy. Empirical results on ViT-based vision tasks show that SFPrompt achieves competitive accuracy with substantially fewer local FLOPs (about 0.46% of those of FL) and much less communication (roughly 0.47× for ViT-Base and 0.19× for ViT-Large) than full federated fine-tuning and other baselines, highlighting its practicality for privacy-constrained, resource-limited deployments.
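The head/body/tail placement described above can be illustrated with a minimal sketch. The helper names `split_layers` and `prepend_prompts`, the 12-block model, the one-block head and tail, and the point at which prompts are attached are all assumptions made for illustration; they are not taken from the paper's implementation.

```python
import numpy as np

def split_layers(layers, n_head, n_tail):
    """Partition an ordered list of layers into head, body, and tail."""
    if n_head < 0 or n_tail < 0 or n_head + n_tail > len(layers):
        raise ValueError("head and tail must fit inside the model")
    cut = len(layers) - n_tail
    return layers[:n_head], layers[n_head:cut], layers[cut:]

def prepend_prompts(tokens, prompts):
    """Place learnable prompt vectors p in front of the token embeddings."""
    return np.concatenate([prompts, tokens], axis=0)

# Hypothetical 12-block model; the block names are placeholders.
layers = [f"block_{i}" for i in range(12)]
head, body, tail = split_layers(layers, n_head=1, n_tail=1)

client_side = head + tail  # W_C = [W_h, W_t] stays on the device
server_side = body         # W_S = W_b runs on the server

# Assumed shapes: 196 patch tokens and 8 prompt vectors of width 768.
tokens = np.zeros((196, 768))
prompts = np.zeros((8, 768))
extended = prepend_prompts(tokens, prompts)
```

In this sketch the client holds only 2 of the 12 blocks, which is the intuition behind the small local compute footprint reported above.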
Abstract
Large pre-trained models have achieved remarkable results across many domains. Because training these models is so costly, fine-tuning has been widely studied as a way to harness their capabilities for downstream tasks. Yet conventional fine-tuning becomes infeasible when the model cannot access downstream data because of privacy concerns. Naively integrating fine-tuning approaches with emerging federated learning frameworks incurs substantial communication overhead and places high demands on local computing resources, making it impractical for common resource-limited devices. In this paper, we introduce SFPrompt, a privacy-preserving fine-tuning method tailored for the federated setting where direct uploading of raw data is prohibited and local devices are too resource-constrained to run a complete pre-trained model. In essence, SFPrompt combines split learning with federated learning to handle these challenges. Specifically, the pre-trained model is first partitioned into client and server components, thereby streamlining the client-side model and substantially reducing the computational demands on local resources. SFPrompt then introduces soft prompts into the federated model to enhance fine-tuning performance. To further reduce communication costs, a novel dataset pruning algorithm and a local-loss update strategy are applied during fine-tuning. Extensive experiments demonstrate that SFPrompt delivers performance competitive with federated full fine-tuning while consuming a mere 0.46% of the local computing resources and incurring 53% less communication cost.
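As a rough illustration of EL2N-driven dataset pruning, the sketch below computes the standard EL2N score (the L2 norm of the difference between the softmax output and the one-hot label) and keeps a fraction of the samples. Keeping the highest-scoring samples, the `keep_fraction` parameter, and the function names are assumptions for illustration; the paper's actual selection rule may differ.

```python
import numpy as np

def el2n_scores(logits, labels):
    """EL2N score per sample: ||softmax(logits) - one_hot(label)||_2."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(shifted)
    probs = probs / probs.sum(axis=1, keepdims=True)
    one_hot = np.eye(logits.shape[1])[labels]
    return np.linalg.norm(probs - one_hot, axis=1)

def prune_by_el2n(logits, labels, keep_fraction):
    """Return indices of the retained samples, highest score first.

    Assumption: samples with the largest error are kept.
    """
    scores = el2n_scores(logits, labels)
    n_keep = max(1, int(round(keep_fraction * len(scores))))
    return np.argsort(scores)[::-1][:n_keep]

# Toy batch: a confident correct, a confident wrong, and an unsure prediction.
logits = np.array([[10.0, 0.0], [0.0, 10.0], [0.0, 0.0]])
labels = np.array([0, 0, 0])
kept = prune_by_el2n(logits, labels, keep_fraction=0.67)
```

Here the confidently correct sample has a score near zero and is dropped, so only the two more informative samples would take part in the later, communication-heavy training rounds.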
