SemiSFL: Split Federated Learning on Unlabeled and Non-IID Data
Yang Xu, Yunming Liao, Hongli Xu, Zhipeng Sun, Liusheng Huang, Chunming Qiao
TL;DR
This work tackles the practical problem of training large models in Split Federated Learning (SFL) when client data are both unlabeled and non-IID. It introduces SemiSFL, a semi-supervised SFL framework built on clustering regularization: on the server, teacher features are projected into pseudo clusters that steer the student features, while a memory queue and a projection head enable cross-client knowledge sharing. A convergence analysis shows how the interplay between supervised and semi-supervised updates affects learning, and a greedy adaptive algorithm dynamically tunes the global updating frequency to balance efficiency and accuracy. Empirical results across four real-world datasets show that, compared with state-of-the-art baselines, SemiSFL achieves up to 5.8% higher accuracy under non-IID scenarios, a 3.8x speed-up in training time, and about 70.3% lower communication cost, and that it remains robust to varying data distributions and label availability. These findings suggest that SemiSFL can enable scalable, privacy-preserving training of large models on heterogeneous edge devices with limited labeled data.
Abstract
Federated Learning (FL) has emerged to allow multiple clients to collaboratively train machine learning models on their private data at the network edge. However, training and deploying large-scale models on resource-constrained devices is challenging. Split Federated Learning (SFL) offers a feasible solution by alleviating the computation and/or communication burden on clients. Existing SFL works, however, often assume sufficient labeled data on clients, which is usually impractical. In addition, non-IID data pose another challenge to efficient model training. To the best of our knowledge, these two issues have not been addressed simultaneously in SFL. Herein, we propose a novel semi-supervised SFL system, termed SemiSFL, which incorporates clustering regularization to perform SFL with unlabeled and non-IID client data. Moreover, our theoretical and experimental investigations into model convergence reveal that the inconsistent training processes on labeled and unlabeled data affect the effectiveness of clustering regularization. To mitigate this training inconsistency, we develop an algorithm that dynamically adjusts the global updating frequency to improve training performance. Extensive experiments on benchmark models and datasets show that, compared to state-of-the-art baselines, our system provides a 3.8x speed-up in training time and reduces the communication cost by about 70.3% while reaching the target accuracy, and achieves up to 5.8% improvement in accuracy under non-IID scenarios.
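To make the idea of clustering regularization more concrete, the following is a minimal sketch and not the authors' formulation. It assumes a supervised-contrastive-style loss in which the teacher's confident predictions serve as pseudo cluster labels, and projected student features are pulled toward memory-queue entries that share the same pseudo label. The function name `clustering_regularization`, the temperature `tau`, and the confidence threshold `conf_threshold` are all hypothetical choices made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def clustering_regularization(student_z, teacher_probs, queue_z, queue_labels,
                              tau=0.1, conf_threshold=0.95):
    """Illustrative clustering-style regularizer (an assumption, not the paper's loss).

    student_z:     (B, D) projected student features for a batch
    teacher_probs: (B, C) teacher class probabilities for the same batch
    queue_z:       (Q, D) projected teacher features held in a memory queue
    queue_labels:  (Q,)   pseudo cluster labels of the queue entries
    """
    student_z = l2_normalize(student_z)
    queue_z = l2_normalize(queue_z)

    # Pseudo cluster assignment from the teacher; keep only confident samples.
    pseudo = teacher_probs.argmax(axis=1)
    confident = teacher_probs.max(axis=1) >= conf_threshold

    # Temperature-scaled similarities, turned into log-probabilities over the queue.
    logits = student_z @ queue_z.T / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positives are queue entries whose pseudo label matches the sample's.
    pos = (pseudo[:, None] == queue_labels[None, :]).astype(float)
    pos_count = pos.sum(axis=1)
    valid = confident & (pos_count > 0)
    if not valid.any():
        return 0.0

    per_sample = -(pos * log_prob).sum(axis=1)[valid] / pos_count[valid]
    return float(per_sample.mean())

# Toy usage: a two-entry queue with one feature per pseudo cluster.
queue_z = np.array([[1.0, 0.0], [0.0, 1.0]])
queue_labels = np.array([0, 1])
teacher_probs = np.array([[0.99, 0.01]])  # teacher is confident in cluster 0

aligned = clustering_regularization(np.array([[1.0, 0.0]]), teacher_probs,
                                    queue_z, queue_labels)
misaligned = clustering_regularization(np.array([[0.0, 1.0]]), teacher_probs,
                                       queue_z, queue_labels)
```

In this toy setup, the loss is small when the student feature lies near the queue entry of its pseudo cluster and large when it lies near the other cluster, which is the steering effect the regularizer is meant to provide. Because the queue can hold features originating from different clients, a loss of this kind is one plausible way for knowledge to be shared across clients.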
