MergeSFL: Split Federated Learning with Feature Merging and Batch Size Regulation
Yunming Liao, Yang Xu, Hongli Xu, Lun Wang, Zhiwei Yao, Chunming Qiao
TL;DR
MergeSFL tackles the dual challenges of statistical and system heterogeneity in split federated learning by jointly applying feature merging and batch size regulation. By merging features from multiple workers into a mixed sequence and dynamically tailoring per-worker batch sizes, MergeSFL brings training closer to what IID data would produce and reduces idle waiting, achieving faster convergence and lower communication cost. The framework has two modules. A control module estimates worker states, selects participants with a genetic algorithm to minimize the KL divergence between the merged and IID distributions, and optimizes batch sizes under bandwidth constraints. A training module handles bottom-model updates, feature merging, gradient dispatch, and adaptive bottom-model aggregation. Empirical results on 80 Jetson devices across HAR, Speech, CIFAR-10, and IMAGE-100 show substantial improvements in accuracy (up to 26.22 percentage points) and speedups (up to 4.14x) versus strong baselines, confirming the practical value of jointly optimizing feature merging and batch size regulation in heterogeneous edge deployments.
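The selection criterion mentioned above, the KL divergence between the merged and IID distributions, can be illustrated with a minimal sketch. It assumes that each worker is summarized by a label distribution, that merging weights each worker by its batch size, and that the IID reference is a known class distribution; the function names and the example numbers are hypothetical and are not taken from the paper.

```python
import math


def merged_label_distribution(label_dists, batch_sizes):
    """Batch-size-weighted mixture of per-worker label distributions (assumed merge rule)."""
    total = sum(batch_sizes)
    num_classes = len(label_dists[0])
    return [
        sum(b * dist[c] for dist, b in zip(label_dists, batch_sizes)) / total
        for c in range(num_classes)
    ]


def kl_divergence(p, q):
    """KL(p || q) in nats; classes with p[c] == 0 contribute nothing."""
    return sum(pc * math.log(pc / qc) for pc, qc in zip(p, q) if pc > 0)


# Hypothetical example: two workers whose label skews are complementary.
worker_dists = [[0.9, 0.1], [0.1, 0.9]]
iid_reference = [0.5, 0.5]  # assumed IID class distribution

merged = merged_label_distribution(worker_dists, [32, 32])
single_worker_gap = kl_divergence(worker_dists[0], iid_reference)
merged_gap = kl_divergence(merged, iid_reference)
```

In this toy case each worker alone is far from the reference, while the merged mixture is essentially indistinguishable from it. A participant-selection step, such as the genetic algorithm the framework uses, could score candidate worker sets with a quantity of this kind.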
Abstract
Recently, federated learning (FL) has emerged as a popular technique for edge AI to mine valuable knowledge in edge computing (EC) systems. To mitigate the computing/communication burden on resource-constrained workers and protect model privacy, split federated learning (SFL) has been proposed, integrating both data and model parallelism. Beyond resource limitations, SFL still faces two other critical challenges in EC: statistical heterogeneity and system heterogeneity. To address these challenges, we propose a novel SFL framework, termed MergeSFL, which incorporates feature merging and batch size regulation into SFL. Concretely, feature merging merges the features from workers into a mixed feature sequence, which is approximately equivalent to the features derived from IID data and is used to improve model accuracy. Batch size regulation, in turn, assigns diverse and suitable batch sizes to heterogeneous workers to improve training efficiency. Moreover, MergeSFL jointly optimizes these two strategies, exploiting their coupled relationship, to further enhance the performance of SFL. Extensive experiments are conducted on a physical platform with 80 NVIDIA Jetson edge devices, and the experimental results show that MergeSFL can improve the final model accuracy by 5.82% to 26.22%, with a speedup of about 1.74x to 4.14x, compared to the baselines.
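To make the intuition behind batch size regulation concrete, the sketch below splits a fixed total batch across workers in proportion to their processing speed, so that every worker finishes an iteration at roughly the same time. This is a simplified illustration, not the paper's optimization: it ignores bandwidth constraints and the coupling with feature merging, and the function name, the per-sample timing model, and the numbers are assumptions.

```python
def regulate_batch_sizes(per_sample_times, total_batch, min_batch=1):
    """Assign batch sizes proportional to worker speed (1 / per-sample time).

    per_sample_times: assumed seconds each worker needs to process one sample.
    total_batch: target sum of batch sizes (rounding may shift it slightly).
    """
    speeds = [1.0 / t for t in per_sample_times]
    total_speed = sum(speeds)
    return [
        max(min_batch, round(total_batch * s / total_speed))
        for s in speeds
    ]


# Hypothetical workers: fast, medium, and slow devices.
per_sample_times = [0.01, 0.02, 0.04]
batch_sizes = regulate_batch_sizes(per_sample_times, total_batch=70)

# Estimated time each worker spends on one iteration with its assigned batch.
iteration_times = [b * t for b, t in zip(batch_sizes, per_sample_times)]
```

With these numbers the fast worker receives the largest batch and the slow worker the smallest, and the estimated per-iteration times are nearly equal, which is the waiting-time reduction the abstract refers to.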
