Convergence Analysis of Split Federated Learning on Heterogeneous Data

Pengchao Han; Chao Huang; Geng Tian; Ming Tang; Xin Liu

TL;DR

This work provides the first convergence analysis for Split Federated Learning (SFL) on heterogeneous data, introducing a server/client decomposition that decouples server-side and client-side errors. It establishes $O(1/T)$ rates for strongly convex objectives and $O(1/\sqrt[3]{T})$ for general convex objectives, with extensions to non-convex settings and partial client participation; the results align with existing distributed learning bounds while highlighting SFL’s practical advantages. Empirical results on CIFAR-10/100 demonstrate that SFL, particularly the V2 variant, can outperform FL and SL in highly non-IID and large-client regimes, and offer guidance on cut-layer placement and participation strategies. These findings provide a principled basis for deploying SFL in real-world distributed learning systems where data heterogeneity and device availability are prominent concerns.
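To make the gap between the two rates concrete, the following worked bound converts each rate into the number of rounds needed to reach a target accuracy $\epsilon$. The constant $C$ is an assumption introduced here for illustration; it stands in for the problem-dependent factors (smoothness, gradient variance, heterogeneity) that the $O(\cdot)$ notation hides, and it is not a quantity taken from the paper.

$$
\frac{C}{T} \le \epsilon \iff T \ge \frac{C}{\epsilon},
\qquad
\frac{C}{\sqrt[3]{T}} \le \epsilon \iff T \ge \left(\frac{C}{\epsilon}\right)^{3}.
$$

For example, with $C = 1$ and $\epsilon = 0.1$, the strongly convex rate needs on the order of $10$ rounds, while the general convex rate needs on the order of $1000$.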

Abstract

Split federated learning (SFL) is a recent distributed approach for collaborative model training among multiple clients. In SFL, a global model is typically split into two parts, where clients train one part in a parallel federated manner, and a main server trains the other. Despite recent research on SFL algorithm development, a convergence analysis of SFL is missing from the literature, and this paper aims to fill this gap. The analysis of SFL can be more challenging than that of federated learning (FL), due to the potential dual-paced updates at the clients and the main server. We provide convergence analysis of SFL for strongly convex and general convex objectives on heterogeneous data. The convergence rates are $O(1/T)$ and $O(1/\sqrt[3]{T})$, respectively, where $T$ denotes the total number of rounds for SFL training. We further extend the analysis to non-convex objectives and the scenario where some clients may be unavailable during training. Experiments validate our theoretical results and show that SFL outperforms FL and split learning (SL) when data is highly heterogeneous across a large number of clients.
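The split described in the abstract can be illustrated with a toy sketch. It is not the paper's algorithm: it assumes a two-layer linear model cut after the first layer, a squared loss, full participation, a single server-side model updated client by client (loosely in the spirit of SFL-V2), and plain averaging of the client-side models at the end of each round. All names (`make_client_data`, `sfl_round`, `TRUE_BETA`) and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
TRUE_BETA = np.array([1.0, -1.0, 0.5, 0.5])  # hypothetical ground-truth weights


def make_client_data(shift, n=64):
    # Assumed form of heterogeneity: each client's inputs have a different mean.
    X = rng.normal(loc=shift, scale=1.0, size=(n, 4))
    return X, X @ TRUE_BETA


def global_loss(W_c, w_s, datasets):
    # Mean squared error of the full model [W_c; w_s], averaged over clients.
    return float(np.mean([np.mean((X @ W_c @ w_s - y) ** 2) for X, y in datasets]))


def sfl_round(client_Ws, w_s, datasets, lr=0.02, local_steps=2):
    """One toy round: clients train their part, the server trains the other."""
    new_client_Ws = []
    for W_c, (X, y) in zip(client_Ws, datasets):
        W_c = W_c.copy()
        for _ in range(local_steps):
            h = X @ W_c                         # client forward pass to the cut layer
            residual = h @ w_s - y              # server forward pass and error
            grad_pred = 2.0 * residual / len(y)
            grad_w_s = h.T @ grad_pred          # gradient for the server-side part
            grad_h = np.outer(grad_pred, w_s)   # gradient sent back to the client
            w_s = w_s - lr * grad_w_s           # server-side update
            W_c = W_c - lr * (X.T @ grad_h)     # client-side update
        new_client_Ws.append(W_c)
    # Federated step: average the client-side models and broadcast the result.
    W_avg = np.mean(new_client_Ws, axis=0)
    return [W_avg.copy() for _ in client_Ws], w_s


datasets = [make_client_data(shift) for shift in (-0.5, 0.0, 0.5)]
W0 = 0.3 * rng.normal(size=(4, 3))
client_Ws = [W0.copy() for _ in datasets]
w_s = 0.3 * rng.normal(size=3)

initial_loss = global_loss(client_Ws[0], w_s, datasets)
for _ in range(40):
    client_Ws, w_s = sfl_round(client_Ws, w_s, datasets)
final_loss = global_loss(client_Ws[0], w_s, datasets)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

The server-side vector `w_s` is updated at every local step of every client, while the client-side matrices are only synchronized once per round; this is one simple way to picture the dual-paced updates that the abstract names as the main difficulty of the analysis.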

Paper Structure (87 sections, 12 theorems, 166 equations, 13 figures, 4 tables, 2 algorithms)

Introduction
Motivation
Related Work
Challenges and Contributions
Problem Formulation
Model
Algorithm Description
Client Participation
Convergence Analysis
Assumptions
Decomposition
Results under Full Participation
Results under Partial Participation
Experimental Results
Setup
...and 72 more sections

Key Result

Proposition 3.5

(Convergence decomposition) Let $\boldsymbol{x}^*\triangleq[\boldsymbol{x}_c^*; \boldsymbol{x}_s^*]$ denote the optimal global model that minimizes $f(\cdot)$, and $\boldsymbol{x}^T\triangleq[\boldsymbol{x}_c^T; \boldsymbol{x}_s^T]$ is the global model obtained after $T$ rounds of SFL training. Unde

Figures (13)

Figure 1: An illustration of the SFL framework and its two major algorithms, SFL-V1 (left) and SFL-V2 (right) (Thapa et al., 2022). SFL-V1 and SFL-V2 are discussed further in the Problem Formulation section.
Figure 2: Impact of the choice of cut layer on SFL performance.
Figure 3: Impact of data heterogeneity on SFL performance.
Figure 4: Impact of client participation on SFL performance.
Figure 5: Performance comparison on CIFAR-10.
...and 8 more figures

Theorems & Definitions (19)

Proposition 3.5
Theorem 3.6
Theorem 3.7
Theorem 3.8
Theorem 3.9
Lemma C.3
proof
Proposition C.4: Decomposition in each round
proof
Lemma C.5
...and 9 more
