Convergence Analysis of Split Federated Learning on Heterogeneous Data
Pengchao Han, Chao Huang, Geng Tian, Ming Tang, Xin Liu
TL;DR
This work provides the first convergence analysis for Split Federated Learning (SFL) on heterogeneous data, introducing a server/client decomposition that decouples server-side and client-side errors. It establishes $O(1/T)$ rates for strongly convex objectives and $O(1/\sqrt[3]{T})$ for general convex objectives, with extensions to non-convex settings and partial client participation; the results align with existing distributed learning bounds while highlighting SFL’s practical advantages. Empirical results on CIFAR-10/100 demonstrate that SFL, particularly the V2 variant, can outperform FL and SL in highly non-IID and large-client regimes, and offer guidance on cut-layer placement and participation strategies. These findings provide a principled basis for deploying SFL in real-world distributed learning systems where data heterogeneity and device availability are prominent concerns.
Abstract
Split federated learning (SFL) is a recent distributed approach for collaborative model training among multiple clients. In SFL, a global model is typically split into two parts, where clients train one part in a parallel federated manner, and a main server trains the other. Despite recent research on SFL algorithm development, a convergence analysis of SFL is missing in the literature, and this paper aims to fill this gap. The analysis of SFL can be more challenging than that of federated learning (FL), due to the potential dual-paced updates at the clients and the main server. We provide a convergence analysis of SFL for strongly convex and general convex objectives on heterogeneous data. The convergence rates are $O(1/T)$ and $O(1/\sqrt[3]{T})$, respectively, where $T$ denotes the total number of rounds for SFL training. We further extend the analysis to non-convex objectives and the scenario where some clients may be unavailable during training. Experiments validate our theoretical results and show that SFL outperforms FL and split learning (SL) when data is highly heterogeneous across a large number of clients.
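To make the split described above concrete, the following is a minimal sketch of one possible SFL round, not the paper's algorithm. It assumes a toy two-part linear model with squared loss, a single local step per round, and plain averaging of both the client-side and server-side parts. All names (`split_step`, `sfl_round`, `global_loss`, `W_client`, `w_server`) and the synthetic heterogeneous data are hypothetical choices made for illustration.

```python
import numpy as np

def split_step(W_client, w_server, X, y, lr):
    """One split forward/backward pass for a single client (squared loss)."""
    n = len(y)
    H = X @ W_client                      # client-side forward; H is sent to the server
    err = H @ w_server - y                # server-side forward and residual
    grad_server = H.T @ err / n           # gradient w.r.t. the server-side part
    grad_H = np.outer(err, w_server) / n  # gradient w.r.t. activations, returned to the client
    grad_client = X.T @ grad_H            # client-side backward
    return W_client - lr * grad_client, w_server - lr * grad_server

def sfl_round(W_client, w_server, client_data, lr=0.01):
    """All clients start from the same model; both parts are averaged afterwards."""
    updates = [split_step(W_client, w_server, X, y, lr) for X, y in client_data]
    W_new = np.mean([u[0] for u in updates], axis=0)
    w_new = np.mean([u[1] for u in updates], axis=0)
    return W_new, w_new

def global_loss(W_client, w_server, client_data):
    """Average of the per-client squared losses."""
    return float(np.mean([0.5 * np.mean((X @ W_client @ w_server - y) ** 2)
                          for X, y in client_data]))

# Assumed toy setup: each client's features are shifted differently (heterogeneous data).
rng = np.random.default_rng(0)
d, h, n_clients, n = 5, 3, 4, 50
w_true = rng.normal(size=d)
client_data = []
for k in range(n_clients):
    X = rng.normal(loc=0.5 * k, size=(n, d))
    client_data.append((X, X @ w_true))

W_client = 0.5 * rng.normal(size=(d, h))
w_server = 0.5 * rng.normal(size=h)
losses = [global_loss(W_client, w_server, client_data)]
for _ in range(200):
    W_client, w_server = sfl_round(W_client, w_server, client_data)
    losses.append(global_loss(W_client, w_server, client_data))

print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

With a single local step and equal-sized clients, the averaged update in this sketch coincides with one gradient step on the average loss, so it does not capture the dual-paced client and server updates that the analysis addresses; it only illustrates how the forward and backward passes are divided between the two parts.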
