Have Your Cake and Eat It Too: Toward Efficient and Accurate Split Federated Learning

Dengke Yan; Ming Hu; Zeke Xia; Yanxin Yang; Jun Xia; Xiaofei Xie; Mingsong Chen

TL;DR

S$^2$FL tackles the twin challenges of stragglers and non-IID data in Split Federated Learning by introducing adaptive sliding model splitting and a data balance-based training mechanism. A three-part model (client-side, shared, server-side) is trained with device-aware partitions, feature-grouping by labels, and a novel aggregation method that respects per-device data weights. The authors provide convergence guarantees with an $O(1/t)$ rate under standard smoothness and convexity assumptions, and show through extensive experiments that S$^2$FL yields up to 16.5% accuracy gains and up to 3.54× training speedups over baselines across diverse AIoT datasets and models. This approach offers practical improvements for resource-constrained edge environments by aligning device workloads and homogenizing server-side training data distributions, thereby enabling efficient and accurate large-model federated learning.

Abstract

Due to its advantages in resource-constrained scenarios, Split Federated Learning (SFL) is promising in AIoT systems. However, because of data heterogeneity and stragglers, SFL suffers from low inference accuracy and low training efficiency. To address these issues, this paper presents a novel SFL approach, named Sliding Split Federated Learning (S$^2$FL), which adopts an adaptive sliding model split strategy and a data balance-based training mechanism. By dynamically dispatching different model portions to AIoT devices according to their computing capability, S$^2$FL can alleviate the low training efficiency caused by stragglers. By combining features uploaded by devices with different data distributions to generate multiple larger batches with a uniform distribution for back-propagation, S$^2$FL can alleviate the performance degradation caused by data heterogeneity. Experimental results demonstrate that, compared to conventional SFL, S$^2$FL can achieve up to 16.5% inference accuracy improvement and 3.54× training acceleration.
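The data-balance idea in the abstract (combining features from devices with different label distributions so that each merged batch is close to uniform) can be illustrated with a minimal sketch. This is not the authors' grouping procedure, which is not given here; the greedy heuristic, the function names `imbalance` and `group_devices_for_balance`, and the `label_counts` input format are all assumptions made for illustration.

```python
from typing import Dict, List


def imbalance(hist: List[int]) -> float:
    """L1 distance between a label histogram and the uniform distribution."""
    total = sum(hist)
    if total == 0:
        return 0.0
    uniform = 1.0 / len(hist)
    return sum(abs(count / total - uniform) for count in hist)


def group_devices_for_balance(
    label_counts: Dict[str, List[int]], group_size: int
) -> List[List[str]]:
    """Greedily group devices so each group's summed label histogram is near uniform.

    label_counts maps a (hypothetical) device id to its per-label sample counts.
    """
    remaining = dict(label_counts)
    groups: List[List[str]] = []
    while remaining:
        # Seed a new group with the next ungrouped device.
        seed = next(iter(remaining))
        group = [seed]
        merged = list(remaining.pop(seed))
        while remaining and len(group) < group_size:
            # Pick the device whose histogram makes the merged batch most uniform.
            best = min(
                remaining,
                key=lambda d: imbalance(
                    [a + b for a, b in zip(merged, remaining[d])]
                ),
            )
            merged = [a + b for a, b in zip(merged, remaining.pop(best))]
            group.append(best)
        groups.append(group)
    return groups


# Hypothetical example: four devices, two labels, each device holds one label only.
counts = {"a": [10, 0], "b": [0, 10], "c": [10, 0], "d": [0, 10]}
groups = group_devices_for_balance(counts, group_size=2)
```

In this toy setting each resulting group pairs a device holding only label 0 with one holding only label 1, so the merged batch used for server-side back-propagation is balanced even though every individual device is fully skewed.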

Paper Structure (17 sections, 4 theorems, 12 equations, 8 figures, 3 tables, 2 algorithms)

Introduction
Background and Related Work
Our Approach
Adaptive Sliding Model Split Strategy
Data Balance-based Training Mechanism
Model Aggregation
Implementation of Our Approach
Convergence Analysis
Notations and Assumptions
Proofs of Key Lemmas
Proof of Theorem 1
Experimental Results
Experimental Settings
Performance Comparison
Impacts of Different Configurations
...and 2 more sections

Key Result

Theorem 1

Assume that in $S^2FL$, model aggregation is performed once after every $E$ SGD updates. Choose $r = \max\{8\frac{L}{\mu}, E\} - 1$ and the learning rate $\eta_t = \frac{2}{\mu(t + r)}$. Then $S^2FL$ converges at an $O(1/t)$ rate, with a bound that depends on the initialized model $\overline{w}_1$, the optimal model $w^{\star}$, and the constant $B = \sum\limits_{m=1}^{M}\sum\limits_{i=1}^{n_m}p_{m_i}^2\sigma_{m_i}^2 + 6L\Gamma + 8(E-1)^2G^2$.
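As a small numerical illustration of the quantities in Theorem 1, the sketch below computes the offset $r$, the decaying learning rate $\eta_t$, and the constant $B$ directly from the formulas stated above. The function names and all sample values are hypothetical, and the device weights $p_{m_i}$ and variances $\sigma_{m_i}$ are passed as flat lists over all (group, device) pairs.

```python
from typing import Callable, List, Tuple


def lr_schedule(L: float, mu: float, E: int) -> Tuple[float, Callable[[int], float]]:
    """Return r = max(8L/mu, E) - 1 and the schedule eta_t = 2 / (mu * (t + r))."""
    r = max(8 * L / mu, E) - 1

    def eta(t: int) -> float:
        return 2.0 / (mu * (t + r))

    return r, eta


def constant_B(
    p: List[float], sigma: List[float], L: float, Gamma: float, E: int, G: float
) -> float:
    """B = sum(p_i^2 * sigma_i^2) + 6*L*Gamma + 8*(E-1)^2*G^2, as in Theorem 1."""
    variance_term = sum(pi ** 2 * si ** 2 for pi, si in zip(p, sigma))
    return variance_term + 6 * L * Gamma + 8 * (E - 1) ** 2 * G ** 2


# Hypothetical values chosen only to make the arithmetic easy to follow.
r, eta = lr_schedule(L=1.0, mu=0.5, E=5)
B = constant_B(p=[0.5, 0.5], sigma=[1.0, 1.0], L=1.0, Gamma=0.0, E=1, G=1.0)
```

With these values $8L/\mu = 16$ dominates $E = 5$, so $r = 15$ and the first step size is $\eta_1 = 2 / (0.5 \cdot 16) = 0.25$; the learning rate then decays proportionally to $1/(t + r)$.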

Figures (8)

Figure 1: Framework and workflow of $S^2FL$.
Figure 2: Data balance-based training mechanism
Figure 3: Comparison of size and FLOPs of different model portions
Figure 4: Training process of $S^2FL$ and two baselines on CIFAR-10
Figure 5: Training processes with different numbers of devices
...and 3 more figures

Theorems & Definitions (4)

Theorem 1
Lemma 1: Results of one step SGD
Lemma 2: Bounding the variance
Lemma 3: Bounding the divergence of ${w_t^k}$
