GAS: Generative Activation-Aided Asynchronous Split Federated Learning
Jiarong Yang, Yuan Liu
TL;DR
GAS addresses the latency and bias challenges of asynchronous Split Federated Learning by introducing activation and model buffers that decouple update timing from communication delays. A key innovation is generative activations: the server maintains label-wise activation distributions and samples from them to generate activations that balance updates, reducing gradient dissimilarity and encouraging more reliable server-side updates. Theoretical analysis yields tighter convergence bounds for both server- and client-side models, with a decaying learning rate helping to mitigate straggler effects over time. Empirical results on CIFAR-10, CINIC-10, and Fashion-MNIST under heterogeneous data demonstrate that GAS outperforms both asynchronous FL baselines and synchronous SFL methods in accuracy and convergence speed, highlighting its practical potential for real-world, heterogeneous networks.
Abstract
Split Federated Learning (SFL) splits a shared model between clients and a server and trains it collaboratively, with clients transmitting activations and client-side models to the server for updates. Recent SFL studies assume that activations and client-side models are transmitted from clients to the server synchronously. However, because computational and communication capabilities vary significantly among clients, activations and client-side models arrive at the server asynchronously. The delay caused by this asynchrony significantly degrades the performance of SFL. To address this issue, we consider an asynchronous SFL framework in which an activation buffer and a model buffer are embedded on the server to manage the asynchronously transmitted activations and client-side models, respectively. Furthermore, because asynchronous transmission causes the activation buffer to receive activations from resource-rich clients more frequently, which biases the updates of the server-side model, we propose Generative activations-aided Asynchronous SFL (GAS). In GAS, the server maintains an activation distribution for each label based on received activations and generates activations from these distributions according to the degree of bias. These generative activations are then used to assist in updating the server-side model, ensuring more accurate updates. We derive a tighter convergence bound, and our experiments demonstrate the effectiveness of the proposed method. The code is available at https://github.com/eejiarong/GAS.
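
The generative-activation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that) and rests on several assumptions that the abstract does not state: each label's activation distribution is modeled as a diagonal Gaussian tracked with running statistics, activations are flattened to vectors, and the "degree of bias" is approximated by how far each label's count in the buffer falls below the most frequent label. The names `LabelwiseActivationStats` and `balance_buffer` are hypothetical.

```python
import numpy as np


class LabelwiseActivationStats:
    """Running per-label mean/variance of activations (assumed diagonal Gaussian)."""

    def __init__(self, num_labels, dim):
        self.count = np.zeros(num_labels, dtype=np.int64)
        self.mean = np.zeros((num_labels, dim))
        self.m2 = np.zeros((num_labels, dim))  # sum of squared deviations (Welford)

    def update(self, activations, labels):
        # Fold each received activation into the statistics of its label.
        for a, y in zip(activations, labels):
            self.count[y] += 1
            delta = a - self.mean[y]
            self.mean[y] += delta / self.count[y]
            self.m2[y] += delta * (a - self.mean[y])

    def sample(self, label, n, rng):
        # Draw n synthetic activations for one label from its Gaussian estimate.
        var = self.m2[label] / max(self.count[label] - 1, 1)
        dim = self.mean.shape[1]
        return rng.normal(self.mean[label], np.sqrt(var), size=(n, dim))


def balance_buffer(activations, labels, stats, rng):
    """Top up under-represented labels in a buffer with sampled activations.

    Illustrative rule only: every label is raised to the count of the most
    frequent label in the buffer.
    """
    num_labels = stats.count.shape[0]
    counts = np.bincount(labels, minlength=num_labels)
    target = counts.max()
    all_a, all_y = [activations], [labels]
    for y in range(num_labels):
        deficit = int(target - counts[y])
        if deficit > 0 and stats.count[y] > 0:
            all_a.append(stats.sample(y, deficit, rng))
            all_y.append(np.full(deficit, y, dtype=labels.dtype))
    return np.concatenate(all_a), np.concatenate(all_y)


# Toy usage: 3 labels, 4-dimensional activations.
rng = np.random.default_rng(0)
stats = LabelwiseActivationStats(num_labels=3, dim=4)
hist_a = rng.normal(size=(30, 4))
hist_y = np.repeat(np.arange(3), 10)
stats.update(hist_a, hist_y)

# A buffer dominated by label 0, as if one fast client filled most of it.
buf_a = rng.normal(size=(6, 4))
buf_y = np.array([0, 0, 0, 0, 1, 2])
bal_a, bal_y = balance_buffer(buf_a, buf_y, stats, rng)
```

In this toy run the buffer holds four activations of label 0 and one each of labels 1 and 2, so three synthetic activations are sampled for each of the two rare labels. The balanced batch would then be passed to the server-side model update in place of the raw buffer.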
