Non-Federated Multi-Task Split Learning for Heterogeneous Sources
Yilin Zheng, Atilla Eryilmaz
TL;DR
This work tackles data heterogeneity in edge networks by moving from Federated Learning (FL) to Multi-Task Split Learning (MTSL), where each task m maintains its own model F_m, split between a server component G(φ, ·) and a client component H_m(ψ_m, ·). Instead of federating gradients, the framework communicates only smashed data (the intermediate activations at the split) and partial gradients. This allows the learning rate to be tuned per task and offers potential gains in convergence speed and communication efficiency. Convergence bounds are provided for convex and non-convex objectives under Lipschitz-gradient assumptions, and results for SGD are also discussed. Empirical results on MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100 show that MTSL achieves higher multi-task accuracy and greater robustness to heterogeneity and noise than FedAvg, FedEM, and SplitFed, while requiring fewer training steps and less data transmission, particularly when the data are highly non-i.i.d. The work positions MTSL as a practical alternative to FL in highly heterogeneous edge environments and outlines future directions for privacy-preserving enhancements and dynamic adaptation to data-source heterogeneity.
Abstract
With the development of edge networks and mobile computing, the need to serve heterogeneous data sources at the network edge requires the design of new distributed machine learning mechanisms. As a prevalent approach, Federated Learning (FL) employs parameter-sharing and gradient-averaging between clients and a server. Despite its many favorable qualities, such as convergence and data-privacy guarantees, it is well known that classic FL fails to address data and computation heterogeneity across clients. Most existing works that aim to accommodate such sources of heterogeneity stay within the FL paradigm and modify it to mitigate the negative effects of heterogeneous data. In this work, as an alternative paradigm, we propose a Multi-Task Split Learning (MTSL) framework, which combines the advantages of Split Learning (SL) with the flexibility of distributed network architectures. In contrast to FL, this paradigm treats heterogeneity not as an obstacle to overcome but as a useful property to exploit. As such, this work aims to introduce a new architecture and methodology to perform multi-task learning for heterogeneous data sources efficiently, with the hope of encouraging the community to further explore the potential advantages we reveal. To support this promise, we first show through theoretical analysis that MTSL can achieve fast convergence by tuning the learning rates of the server and the clients. Then, we compare the performance of MTSL with existing multi-task FL methods numerically on several image classification datasets to show that MTSL has advantages over FL in training speed, communication cost, and robustness to heterogeneous data.
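To make the exchange of smashed data and partial gradients concrete, here is a minimal NumPy sketch of one possible split-training loop. It is an illustration under stated assumptions, not the authors' algorithm: we assume the composition F_m(x) = G(φ, H_m(ψ_m, x)), linear client and server maps, a squared loss, labels available at the server, tasks processed one after another, and synthetic data. The function names and learning-rate values are hypothetical.

```python
import numpy as np


def client_forward(psi, x):
    """Client part H_m(psi_m, x): a linear map that produces the smashed data."""
    return x @ psi  # shape (batch, hidden)


def server_step(phi, z, y, lr_server):
    """Server part G(phi, z) with a squared loss (assumed for illustration).

    Returns the updated phi, the loss, and the gradient with respect to the
    smashed data z. Only that partial gradient is sent back to the client.
    """
    err = z @ phi - y                       # prediction error, shape (batch,)
    loss = 0.5 * np.mean(err ** 2)
    grad_phi = z.T @ err / len(y)           # gradient for the shared server part
    grad_z = np.outer(err, phi) / len(y)    # partial gradient, same shape as z
    return phi - lr_server * grad_phi, loss, grad_z


def client_backward(psi, x, grad_z, lr_client):
    """Client completes backpropagation locally with its own learning rate."""
    grad_psi = x.T @ grad_z                 # chain rule through z = x @ psi
    return psi - lr_client * grad_psi


def train(num_rounds=200, seed=0):
    rng = np.random.default_rng(seed)
    dim_in, dim_hidden, num_tasks = 5, 3, 3
    phi = rng.normal(size=dim_hidden)       # shared server parameters
    psis = [0.1 * rng.normal(size=(dim_in, dim_hidden)) for _ in range(num_tasks)]
    lr_server = 0.01                        # hypothetical values; one rate per
    client_lrs = [0.05, 0.05, 0.05]         # task can be tuned independently
    # Heterogeneous sources: every task has a different linear target.
    data = []
    for _ in range(num_tasks):
        x = rng.normal(size=(64, dim_in))
        data.append((x, x @ rng.normal(size=dim_in)))

    history = []
    for _ in range(num_rounds):
        losses = []
        for m, (x, y) in enumerate(data):
            z = client_forward(psis[m], x)                        # client -> server
            phi, loss, grad_z = server_step(phi, z, y, lr_server)
            psis[m] = client_backward(psis[m], x, grad_z, client_lrs[m])
            losses.append(loss)
        history.append(float(np.mean(losses)))
    return history


if __name__ == "__main__":
    history = train()
    print(f"mean loss, first round: {history[0]:.4f}")
    print(f"mean loss, last round:  {history[-1]:.4f}")
```

Note that no client parameters or client gradients are averaged anywhere in this loop: each ψ_m is updated only from its own data, while φ is shared across tasks and updated by the server.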
