The Impact of Cut Layer Selection in Split Federated Learning

Justin Dachille, Chao Huang, Xin Liu

TL;DR

This paper analyzes how cut-layer placement affects Split Federated Learning (SFL) performance for the two main variants: SFL-V1, with per-client server-side models, and SFL-V2, with a single shared server-side model. It provides a convergence analysis showing that SFL-V1 is invariant to cut-layer location: the bound depends on data heterogeneity and gradient variance but not on the cut point. It also presents numerical results across four datasets and two architectures. The experiments show that SFL-V2 is highly sensitive to the cut-layer choice, with early cuts often yielding the best performance and enabling SFL-V2 to outperform FedAvg on heterogeneous data. The findings highlight SFL-V1’s robustness and SFL-V2’s potential for performance gains when the cut layer is chosen carefully, offering practical guidance for deploying privacy-preserving distributed learning systems in non-IID environments.

Abstract

Split Federated Learning (SFL) is a distributed machine learning paradigm that combines federated learning and split learning. In SFL, a neural network is partitioned at a cut layer, with the initial layers deployed on clients and the remaining layers on a training server. There are two main variants of SFL: SFL-V1, where the training server maintains a separate server-side model for each client, and SFL-V2, where the training server maintains a single shared model for all clients. While existing studies have focused on algorithm development for SFL, a comprehensive quantitative analysis of how cut layer selection affects model performance is still lacking. This paper addresses this gap by providing numerical and theoretical analysis of SFL performance and convergence relative to cut layer selection. We find that SFL-V1 is relatively invariant to the choice of cut layer, which is consistent with our theoretical results. Numerical experiments on four datasets and two neural networks show that the cut layer selection significantly affects the performance of SFL-V2. Moreover, SFL-V2 with an appropriate cut layer selection outperforms FedAvg on heterogeneous data.
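
To make the partitioning concrete, here is a minimal sketch of splitting a layer stack at a cut layer. It is an illustration only, not the authors' code: the helper names `split_at_cut_layer` and `run`, and the toy scalar "layers", are assumptions introduced here. The allowed range for the cut index, 1 to L-1, matches the range used in Proposition 1.

```python
from typing import Callable, List, Sequence, Tuple

# A "layer" here is any function from a float to a float (toy assumption).
Layer = Callable[[float], float]


def split_at_cut_layer(
    layers: Sequence[Layer], cut: int
) -> Tuple[List[Layer], List[Layer]]:
    """Return (client_side, server_side) for a cut index in {1, ..., L-1}."""
    if not 1 <= cut <= len(layers) - 1:
        raise ValueError("cut must leave at least one layer on each side")
    return list(layers[:cut]), list(layers[cut:])


def run(layers: Sequence[Layer], x: float) -> float:
    """Apply the layers in order."""
    for layer in layers:
        x = layer(x)
    return x


# Hypothetical four-layer model.
layers: List[Layer] = [
    lambda x: 2.0 * x,
    lambda x: x + 1.0,
    lambda x: x * x,
    lambda x: x - 3.0,
]

client_side, server_side = split_at_cut_layer(layers, cut=1)
activations = run(client_side, 1.5)      # computed on the client, then sent
output = run(server_side, activations)   # completed on the training server
```

Whatever the cut index, composing the two halves reproduces the full forward pass; the cut only changes which party computes which layers and what is transmitted between them.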

Paper Structure

This paper contains 31 sections, 3 theorems, 29 equations, 1 figure, 4 tables, 2 algorithms.

Key Result

Proposition 1

(Convergence Invariability to Cut Layer Selection in SFL-V1) Let the assumptions (non-convexity through heterogeneity) hold, and let $\eta^t \leq \min\left\{ \frac{1}{16S\tau}, \frac{1}{8SK\tau\sum_{k=1}^K\alpha_k^2}\right\}$. Then, for any $L_c\in \{1, 2, \cdots, L-1\}$, the following inequality holds, where $\theta^{\ast}$ is the optimal global model, $E$ is the number of epochs, and $\tau\triangleq

Figures (1)

  • Figure 1: Comparison of distributed learning architectures. The Model Sync Server maintains model consistency across clients, while the Training Server handles model computations. In FL (top left), clients train complete model copies locally and periodically synchronize with the Model Sync Server (1). In SL (top right), the model $\theta$ is split at a cut layer into client-side ($\theta^c$) and server-side ($\theta^s$) components, where clients sequentially take turns: each client computes forward activations up to the cut layer (2), the server completes the forward pass and backpropagation to the cut layer, and the client finishes the backward pass using returned gradients (3). SFL comes in two variants: SFL-V1 (bottom left) and SFL-V2 (bottom right). Both variants split the model and maintain client-side synchronization through FedAvg (1), but differ in server-side processing: after clients send activations (2) and receive gradients (3), SFL-V1 aggregates both client and server-side models, whereas SFL-V2 only aggregates client-side models.
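
The message flow in the caption can be sketched with a toy scalar model. This is a hedged illustration, not the paper's algorithm: the model `y_hat = b * (a * x)`, the squared loss, the function names, and the choice to process clients one after another on the shared server-side weight in SFL-V2 are all assumptions made here for brevity. Here `a` plays the role of the client-side model and `b` the server-side model.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (input x, target y)


def split_step(a: float, b: float, x: float, y: float, lr: float) -> Tuple[float, float]:
    """One SGD step on the loss 0.5 * (b * a * x - y)**2, split at the cut."""
    h = a * x             # (2) client forward pass up to the cut layer
    err = b * h - y       # server completes the forward pass
    grad_b = err * h      # server-side gradient
    grad_h = err * b      # (3) gradient at the cut layer, returned to client
    grad_a = grad_h * x   # client finishes the backward pass
    return a - lr * grad_a, b - lr * grad_b


def fedavg(values: Sequence[float], weights: Sequence[float]) -> float:
    """(1) Weighted average of model parameters."""
    return sum(w * v for w, v in zip(weights, values))


def sfl_round(a: float, b: float, data: List[List[Sample]],
              weights: Sequence[float], lr: float, variant: str) -> Tuple[float, float]:
    """Run one round; variant is 'v1' (per-client b) or 'v2' (shared b)."""
    if variant not in ("v1", "v2"):
        raise ValueError("variant must be 'v1' or 'v2'")
    client_models, server_models = [], []
    shared_b = b
    for samples in data:
        a_k = a
        b_k = b if variant == "v1" else shared_b
        for x, y in samples:
            a_k, b_k = split_step(a_k, b_k, x, y, lr)
        client_models.append(a_k)
        if variant == "v1":
            server_models.append(b_k)   # separate server-side copy per client
        else:
            shared_b = b_k              # single server-side model, updated in place
    new_a = fedavg(client_models, weights)
    new_b = fedavg(server_models, weights) if variant == "v1" else shared_b
    return new_a, new_b


def total_loss(a: float, b: float, data: List[List[Sample]]) -> float:
    return sum(0.5 * (b * a * x - y) ** 2 for s in data for x, y in s)


# Hypothetical two-client dataset drawn from y = 2x.
data = [[(1.0, 2.0), (2.0, 4.0)], [(-1.0, -2.0), (0.5, 1.0)]]
weights = [0.5, 0.5]

a1 = b1 = a2 = b2 = 1.0
for _ in range(50):
    a1, b1 = sfl_round(a1, b1, data, weights, lr=0.05, variant="v1")
    a2, b2 = sfl_round(a2, b2, data, weights, lr=0.05, variant="v2")
```

The only difference between the two branches is what happens to the server-side weight: SFL-V1 averages per-client copies at the end of the round, while SFL-V2 never averages it because there is only one copy.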

Theorems & Definitions (5)

  • Proposition 1
  • Lemma 1
  • proof
  • Lemma 2
  • proof