SFPrompt: Communication-Efficient Split Federated Fine-Tuning for Large Pre-Trained Models over Resource-Limited Devices

Linxiao Cao; Yifei Zhu; Wei Gong

TL;DR

SFPrompt addresses privacy-preserving fine-tuning of large pre-trained models on resource-limited devices by combining split learning with federated learning. It partitions the model into a head $W_h$, a body $W_b$, and a tail $W_t$, places $W_C=[W_h,W_t]$ on the clients and $W_S=W_b$ on the server, and adds learnable prompts $p$. Training follows a three-phase process that includes local-loss updates and dataset pruning. Its key contributions are a dynamic client-server partitioning strategy, prompt-based fine-tuning, and an EL2N-driven data pruning mechanism, which together reduce communication and computation while preserving privacy. On ViT-based vision tasks, SFPrompt achieves accuracy competitive with federated full fine-tuning and other baselines while using about 0.46% of the local FLOPs of FL and roughly 0.47× (ViT-Base) and 0.19× (ViT-Large) of the communication, which makes it practical for privacy-constrained, resource-limited deployments.
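The client/server partition described above can be illustrated with a minimal sketch. The layer names and the choice of one head layer and one tail layer are hypothetical and chosen only for illustration; the summary does not state where SFPrompt actually cuts the model.

```python
from typing import List, Sequence, Tuple


def split_model(
    layers: Sequence[str], head_size: int, tail_size: int
) -> Tuple[List[str], List[str], List[str]]:
    """Split an ordered layer list into head W_h, body W_b, and tail W_t."""
    if head_size < 0 or tail_size < 0 or head_size + tail_size > len(layers):
        raise ValueError("head and tail must fit inside the model")
    cut = len(layers) - tail_size
    head = list(layers[:head_size])
    body = list(layers[head_size:cut])
    tail = list(layers[cut:])
    return head, body, tail


# Hypothetical ViT-like layer names (an assumption, not taken from the paper).
layers = ["patch_embed"] + [f"block_{i}" for i in range(12)] + ["classifier"]

W_h, W_b, W_t = split_model(layers, head_size=1, tail_size=1)
W_C = W_h + W_t  # client-side model: head and tail
W_S = W_b        # server-side model: body
```

In this toy split the client holds only two of the fourteen layers, which is the intuition behind the small local compute footprint reported above.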

Abstract

Large pre-trained models have exhibited remarkable achievements across various domains. The substantial training costs associated with these models have led to extensive study of fine-tuning as a way to harness their capabilities for downstream tasks. Yet conventional fine-tuning approaches become infeasible when the model cannot access downstream data due to privacy concerns. Naively integrating fine-tuning approaches with emerging federated learning frameworks incurs substantial communication overhead and places high demands on local computing resources, making it impractical for common resource-limited devices. In this paper, we introduce SFPrompt, a privacy-preserving fine-tuning method tailored for the federated setting, where direct uploading of raw data is prohibited and local devices are too resource-constrained to run a complete pre-trained model. In essence, SFPrompt combines split learning with federated learning to handle these challenges. Specifically, the pre-trained model is first partitioned into client and server components, thereby streamlining the client-side model and substantially reducing the computational demands on local resources. SFPrompt then introduces soft prompts into the federated model to enhance fine-tuning performance. To further reduce communication costs, a novel dataset pruning algorithm and a local-loss update strategy are applied during fine-tuning. Extensive experiments demonstrate that SFPrompt delivers performance competitive with federated full fine-tuning while consuming a mere 0.46% of local computing resources and incurring 53% less communication cost.
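The dataset pruning mentioned in the abstract is EL2N-driven according to the TL;DR. The sketch below shows how an EL2N score is commonly computed, as the L2 norm of the difference between the softmax output and the one-hot label, and how a dataset could be pruned with it. Keeping the highest-scoring examples and the `keep_fraction` parameter are assumptions made for illustration; the abstract does not give SFPrompt's exact selection rule.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one example's logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def el2n_score(logits: Sequence[float], label: int) -> float:
    """EL2N score: L2 norm of (softmax(logits) - one_hot(label))."""
    probs = softmax(logits)
    return math.sqrt(
        sum((p - (1.0 if k == label else 0.0)) ** 2 for k, p in enumerate(probs))
    )


def prune_by_el2n(
    batch_logits: Sequence[Sequence[float]],
    labels: Sequence[int],
    keep_fraction: float,
) -> List[int]:
    """Return sorted indices of the examples kept after pruning.

    Assumption: the examples with the highest scores are retained.
    """
    scores = [el2n_score(z, y) for z, y in zip(batch_logits, labels)]
    n_keep = max(1, round(keep_fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


# Toy logits: confident-correct, uncertain, confident-wrong, confident-correct.
toy_logits = [[5.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]
toy_labels = [0, 0, 0, 2]
kept = prune_by_el2n(toy_logits, toy_labels, keep_fraction=0.5)
```

With these toy inputs, the two confidently correct examples receive the lowest scores and are the ones dropped, so fewer samples need to pass through the client-server exchange.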

Paper Structure (15 sections, 3 equations, 7 figures, 3 tables, 2 algorithms)

Introduction
Background and Related Work
Federated Fine-tuning Pre-trained Models
Split Federated Learning
Methodology
Framework Overview
Phase 1: Client Self-Update
Phase 2: Split Training
Phase 3: Parameters Aggregation
Analysis of SFPrompt
Experiments
Experimental Setup
Evaluation of SFPrompt
Ablation Study
Conclusions

Figures (7)

Figure 1: Limitations of current fine-tuning approaches in privacy-preserving and resource-limited environments
Figure 2: (a) communication cost comparison in a global round between FL and SFL; (b) the relationship between the communication rounds and communication cost.
Figure 3: Overview of SFPrompt. SFPrompt is a three-phase process: Phase 1 minimizes communication cost through local-loss updates and dataset pruning; Phase 2 splits the foundation model and keeps only the lightweight part on the client to reduce the local computational burden; Phase 3 aggregates the tail model and prompt parameters to refresh the global model.
Figure 4: Comparison of the accuracy among three methods.
Figure 5: The accuracy of SFPrompt with various prompt lengths and tuned parameters on CIFAR-100
...and 2 more figures
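The Figure 3 caption states that Phase 3 aggregates the tail model and prompt parameters. A minimal sketch of one way to do this is a FedAvg-style average weighted by each client's sample count. The weighting scheme, the parameter names `tail` and `prompt`, and the flat-list representation are assumptions made for illustration; the caption does not specify the aggregation rule.

```python
from typing import Dict, List, Sequence


def aggregate(
    client_params: Sequence[Dict[str, List[float]]],
    num_samples: Sequence[int],
) -> Dict[str, List[float]]:
    """Sample-weighted average of per-client parameters (FedAvg-style)."""
    if not client_params or len(client_params) != len(num_samples):
        raise ValueError("need one sample count per client")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    aggregated: Dict[str, List[float]] = {}
    for name in client_params[0]:
        dim = len(client_params[0][name])
        aggregated[name] = [
            sum(n * p[name][j] for p, n in zip(client_params, num_samples)) / total
            for j in range(dim)
        ]
    return aggregated


# Two hypothetical clients holding tail and prompt parameters.
clients = [
    {"tail": [1.0, 2.0], "prompt": [0.0]},
    {"tail": [3.0, 4.0], "prompt": [4.0]},
]
global_params = aggregate(clients, num_samples=[1, 3])
```

Only the tail and prompt parameters appear in this exchange, so the payload per round is small compared with sending a full model.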
