Split Learning in Computer Vision for Semantic Segmentation Delay Minimization
Nikos G. Evgenidis, Nikos A. Mitsiou, Sotiris A. Tegos, Panagiotis D. Diamantoulakis, George K. Karagiannidis
TL;DR
This work tackles the latency bottleneck of real-time semantic segmentation in resource-constrained networks by applying split learning (SL) to partition convolutional neural networks (CNNs) between edge devices and a central server. It develops a joint optimization framework over bandwidth allocation, cut-layer selection, and server-resource allocation, and examines both parallel and serial processing modes. The authors introduce low-complexity heuristics and alternating optimization that achieve near-optimal delay, and report substantial latency improvements in numerical experiments with BM-aware CNNs like ENet. The results indicate that SL can enable responsive, distributed computer vision (CV) applications in dynamic edge–cloud environments, with practical implications for autonomous systems and smart infrastructure.
Abstract
In this paper, we propose a novel approach to minimize the inference delay in semantic segmentation using split learning (SL), tailored to the needs of real-time computer vision (CV) applications for resource-constrained devices. Semantic segmentation is essential for applications such as autonomous vehicles and smart city infrastructure, but faces significant latency challenges due to high computational and communication loads. Traditional centralized processing methods are inefficient in such scenarios and often result in unacceptable inference delays. SL offers a promising alternative by partitioning deep neural networks (DNNs) between edge devices and a central server, enabling localized data processing and reducing the amount of data that must be transmitted. Our contribution is the joint optimization of bandwidth allocation, cut-layer selection for the edge devices' DNNs, and the allocation of the central server's processing resources. We investigate both parallel and serial data processing scenarios and propose low-complexity heuristic solutions that maintain near-optimal performance while reducing computational requirements. Numerical results show that our approach effectively reduces inference delay, demonstrating the potential of SL for improving real-time CV applications in dynamic, resource-constrained environments.
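To make the cut-layer trade-off concrete, the following is a minimal single-device sketch, not the paper's formulation. It assumes a simple additive delay model (on-device compute, plus uplink transmission of the cut-layer output, plus server compute) and picks the cut layer by exhaustive search. The names `Layer`, `inference_delay`, and `best_cut`, the assumption that the final result is not sent anywhere when the whole network runs on the device, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    flops: float     # computation needed to run this layer (FLOPs)
    out_bits: float  # size of this layer's output (bits)


def inference_delay(layers, cut, input_bits, device_flops, server_flops, rate_bps):
    """Delay when layers[:cut] run on the device and layers[cut:] on the server.

    Assumed toy model: delay = local compute + transmission + server compute.
    """
    local = sum(layer.flops for layer in layers[:cut]) / device_flops
    remote = sum(layer.flops for layer in layers[cut:]) / server_flops
    if cut == len(layers):
        tx_bits = 0.0  # assumption: fully local inference sends nothing
    elif cut == 0:
        tx_bits = input_bits  # fully offloaded: the raw input is transmitted
    else:
        tx_bits = layers[cut - 1].out_bits  # output of the last local layer
    return local + tx_bits / rate_bps + remote


def best_cut(layers, input_bits, device_flops, server_flops, rate_bps):
    """Exhaustive search over all cut positions 0..len(layers)."""
    return min(
        range(len(layers) + 1),
        key=lambda c: inference_delay(
            layers, c, input_bits, device_flops, server_flops, rate_bps
        ),
    )


# Hypothetical three-layer network and link/compute parameters.
layers = [Layer(2e8, 4e6), Layer(4e8, 1e6), Layer(8e8, 2e6)]
params = dict(input_bits=8e6, device_flops=1e9, server_flops=1e10, rate_bps=1e7)

cut = best_cut(layers, **params)
delay = inference_delay(layers, cut, **params)
print(f"best cut after layer {cut}, delay = {delay:.2f} s")
```

In this toy setting, cutting after the first layer wins because that layer shrinks the data to be transmitted without costing much device compute. The joint problem described in the abstract additionally couples many devices through shared bandwidth and server resources, which this sketch does not capture.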
