
Collaborative Split Federated Learning with Parallel Training and Aggregation

Yiannis Papageorgiou, Yannis Thomas, Alexios Filippakopoulos, Ramin Khalili, Iordanis Koutsopoulos

TL;DR

This paper tackles training delay and communication overhead in Federated Learning by introducing Collaborative Split Federated Learning (C-SFL), which partitions a neural network into three parts at two split layers: a weak-side portion trained on computationally weak clients, an aggregator-side portion trained by local aggregators, and a server-side portion trained on the server. The architecture enables parallel FP/BP and aggregation across weak clients, local aggregators, and the server, using a local loss at the cut layer to accelerate updates without sacrificing accuracy. The two split layers, $h$ and $v$, are selected through an exhaustive search with complexity $O(V^2)$, and the authors provide a delay decomposition $D_{round}=D_0+E\cdot B\cdot(D_1+D_2)+D_3$ to quantify performance gains. Empirical results on MNIST, FMNIST, and CIFAR-10 show that C-SFL reduces training delay and communication overhead while achieving higher accuracy than standard SFL and LocSplitFed, particularly under high client heterogeneity and constrained transmission rates, which suggests the scheme is best suited to heterogeneous edge environments.
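The split-layer selection can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that $V$ is the number of model layers, that $h < v$, that $E$ and $B$ are the local epochs and batches per epoch, and that a hypothetical `cost_model(h, v)` returns the four delay terms $D_0, D_1, D_2, D_3$ for a candidate pair. The summary does not define these terms, so `toy_cost_model` below is invented purely so the search can run.

```python
from itertools import combinations
from typing import Callable, NamedTuple, Tuple


class DelayTerms(NamedTuple):
    """The four terms of D_round; their physical meaning is not given in the summary."""
    d0: float  # term counted once per round, before the E*B iterations
    d1: float  # first term counted once per batch
    d2: float  # second term counted once per batch
    d3: float  # term counted once per round, after the E*B iterations


def round_delay(terms: DelayTerms, epochs: int, batches: int) -> float:
    """D_round = D_0 + E * B * (D_1 + D_2) + D_3."""
    return terms.d0 + epochs * batches * (terms.d1 + terms.d2) + terms.d3


def best_split(
    num_layers: int,
    epochs: int,
    batches: int,
    cost_model: Callable[[int, int], DelayTerms],
) -> Tuple[float, int, int]:
    """Try every pair (h, v) with 1 <= h < v <= num_layers - 1.

    There are O(V^2) such pairs, which matches the stated search complexity.
    Returns (delay, h, v) for the pair with the smallest round delay.
    """
    best = None
    for h, v in combinations(range(1, num_layers), 2):
        delay = round_delay(cost_model(h, v), epochs, batches)
        if best is None or delay < best[0]:
            best = (delay, h, v)
    if best is None:
        raise ValueError("need at least 3 layers to place two split points")
    return best


def toy_cost_model(h: int, v: int) -> DelayTerms:
    """Hypothetical stand-in: per-batch delay is smallest at h=2, v=4."""
    return DelayTerms(d0=1.0, d1=abs(h - 2), d2=abs(v - 4), d3=1.0)


if __name__ == "__main__":
    print(best_split(num_layers=6, epochs=2, batches=3, cost_model=toy_cost_model))
```

In practice the cost model would be built from measured computation and transmission times per layer; only the enumeration and the delay formula come from the text above.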

Abstract

Federated learning (FL) operates based on model exchanges between the server and the clients, and it suffers from significant client-side computation and communication burden. Split federated learning (SFL) arises as a promising solution by splitting the model into two parts that are trained sequentially: the clients train the first part of the model (client-side model) and transmit it to the server that trains the second (server-side model). Existing SFL schemes, though, still exhibit long training delays and significant communication overhead, especially when clients of different computing capability participate. Thus, we propose Collaborative-Split Federated Learning (C-SFL), a novel scheme that splits the model into three parts, namely the model parts trained at the computationally weak clients, the ones trained at the computationally strong clients, and the ones at the server. Unlike existing works, C-SFL enables parallel training and aggregation of the model's parts at the clients and at the server, resulting in reduced training delays and communication overhead while improving the model's accuracy. Experiments verify the multiple gains of C-SFL against the existing schemes.

Paper Structure

This paper contains 13 sections, 5 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Training workflow for three clients, with client 2 selected as the local aggregator. Clients perform FP on their weak-side models, and they send activations to the aggregator (steps 1 & 2), which continues FP until the cut layer (step 3). The aggregator transmits the activations to the server (step 4) and then the BP process on the aggregator-side (step 5) and weak-side models (step 6), using the local loss at the cut layer, is initiated. In parallel, the server executes FP and BP (steps 5 & 6). Finally, the server-side and aggregator-side models are aggregated also in parallel (step 7).
  • Figure 2: Test accuracy versus training delay.
  • Figure 3: Test accuracy versus communication overhead.
  • Figure 4: Effect of the client heterogeneity ratio ($\gamma$) and the transmission rate ($R$) when training on FMNIST. C-SFL (Ours) is beneficial mainly when client heterogeneity is high and the transmission rate is low (e.g., mobile/IoT devices).
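The Figure 1 workflow implies a simple timing argument, sketched below in Python under simplifying assumptions. Steps 1 to 4 run one after another; the local branch (BP on the aggregator-side and weak-side models) and the server branch (server FP and BP) then run in parallel, so the slower branch sets the per-batch delay. The function name and the step durations are hypothetical, and the sketch ignores any effects not mentioned in the caption.

```python
from typing import Sequence


def batch_delay(
    pre_cut_steps: Sequence[float],
    local_bp_steps: Sequence[float],
    server_steps: Sequence[float],
    parallel: bool = True,
) -> float:
    """Per-batch delay for a workflow shaped like Figure 1.

    pre_cut_steps:  durations of steps 1-4 (weak-side FP, activation transfer,
                    aggregator FP, transfer to the server), run sequentially.
    local_bp_steps: durations of aggregator-side and weak-side BP (steps 5-6).
    server_steps:   durations of server FP and BP (steps 5-6 on the server).
    parallel:       if True the two branches overlap and the slower one
                    dominates; if False they are summed, as a baseline.
    """
    pre = sum(pre_cut_steps)
    local = sum(local_bp_steps)
    server = sum(server_steps)
    overlap = max(local, server) if parallel else local + server
    return pre + overlap


if __name__ == "__main__":
    # Hypothetical durations in seconds, chosen only for illustration.
    steps_1_to_4 = [1.0, 0.5, 1.0, 0.5]
    local_bp = [1.0, 1.0]
    server_fp_bp = [1.5, 1.5]
    print(batch_delay(steps_1_to_4, local_bp, server_fp_bp, parallel=True))
    print(batch_delay(steps_1_to_4, local_bp, server_fp_bp, parallel=False))
```

With these made-up numbers the overlapped variant is shorter by the duration of the faster branch, which is the kind of saving the parallel design targets.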