The Impact of Cut Layer Selection in Split Federated Learning
Justin Dachille, Chao Huang, Xin Liu
TL;DR
This paper analyzes how cut-layer placement affects Split Federated Learning (SFL) performance for the two main SFL variants: SFL-V1, which keeps a separate server-side model for each client, and SFL-V2, which keeps a single shared server-side model. It provides a convergence analysis showing that SFL-V1 is invariant to cut-layer location, supported by a bound that depends on data heterogeneity and gradient variance but not on the cut point, and presents numerical results across four datasets and two architectures. The experiments reveal that SFL-V2 is highly sensitive to the cut-layer choice, with early cuts often yielding the best performance and enabling SFL-V2 to outperform FedAvg on heterogeneous data. The findings highlight SFL-V1’s robustness and SFL-V2’s potential for performance gains when the cut layer is carefully chosen, offering practical guidance for deploying privacy-preserving distributed learning systems in non-IID environments.
Abstract
Split Federated Learning (SFL) is a distributed machine learning paradigm that combines federated learning and split learning. In SFL, a neural network is partitioned at a cut layer, with the initial layers deployed on the clients and the remaining layers on a training server. There are two main variants of SFL: SFL-V1, where the training server maintains a separate server-side model for each client, and SFL-V2, where the training server maintains a single shared model for all clients. While existing studies have focused on algorithm development for SFL, a comprehensive quantitative analysis of how cut layer selection affects model performance is still lacking. This paper addresses this gap by providing a numerical and theoretical analysis of SFL performance and convergence with respect to cut layer selection. We find that SFL-V1 is relatively invariant to the choice of cut layer, which is consistent with our theoretical results. Numerical experiments on four datasets and two neural networks show that cut layer selection significantly affects the performance of SFL-V2. Moreover, SFL-V2 with an appropriate cut layer selection outperforms FedAvg on heterogeneous data.
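The partitioning described in the abstract can be illustrated with a minimal sketch. It only shows how a layer list might be split at a cut layer and how the server-side state differs between the two variants; it performs no training and is not the authors' implementation. The layer names and the helpers `split_at_cut_layer` and `init_server_models` are hypothetical and chosen for illustration.

```python
from copy import deepcopy


def split_at_cut_layer(layers, cut):
    """Split a layer list so the first `cut` layers go to the clients
    and the remaining layers go to the training server."""
    if not 0 < cut < len(layers):
        raise ValueError("cut must leave at least one layer on each side")
    return layers[:cut], layers[cut:]


def init_server_models(server_side, num_clients, variant):
    """Return the server-side models kept by the training server.

    SFL-V1: one independent server-side copy per client.
    SFL-V2: a single server-side model shared by all clients.
    """
    if variant == "V1":
        return [deepcopy(server_side) for _ in range(num_clients)]
    if variant == "V2":
        return [server_side]
    raise ValueError("variant must be 'V1' or 'V2'")


# Hypothetical five-layer network, cut after the second layer (an early cut).
layers = ["conv1", "conv2", "conv3", "fc1", "fc2"]
client_side, server_side = split_at_cut_layer(layers, cut=2)

v1_models = init_server_models(server_side, num_clients=3, variant="V1")
v2_models = init_server_models(server_side, num_clients=3, variant="V2")

print(client_side)      # layers held by each client
print(server_side)      # layers held by the training server
print(len(v1_models))   # number of server-side models in SFL-V1
print(len(v2_models))   # number of server-side models in SFL-V2
```

Moving `cut` toward the input leaves more layers on the server. In SFL-V2 those layers belong to the single shared model, whereas in SFL-V1 each client keeps its own server-side copy.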
