Split Learning in Computer Vision for Semantic Segmentation Delay Minimization

Nikos G. Evgenidis, Nikos A. Mitsiou, Sotiris A. Tegos, Panagiotis D. Diamantoulakis, George K. Karagiannidis

TL;DR

This work tackles the latency bottleneck of real-time semantic segmentation in resource-constrained networks by applying split learning (SL) to partition convolutional neural networks (CNNs) between edge devices and a central server. It formulates a joint optimization over bandwidth allocation, cut-layer selection, and server-resource allocation, and examines both parallel and serial processing modes. The authors introduce low-complexity heuristics and alternating optimization that achieve near-optimal delay, and report substantial latency improvements in numerical experiments with CNNs built from bottleneck modules (BMs), such as ENet. The results indicate that SL can enable responsive, distributed computer vision (CV) applications in dynamic edge–cloud environments, with practical implications for autonomous systems and smart infrastructure.

Abstract

In this paper, we propose a novel approach to minimize the inference delay in semantic segmentation using split learning (SL), tailored to the needs of real-time computer vision (CV) applications for resource-constrained devices. Semantic segmentation is essential for applications such as autonomous vehicles and smart city infrastructure, but faces significant latency challenges due to high computational and communication loads. Traditional centralized processing methods are inefficient for such scenarios, often resulting in unacceptable inference delays. SL offers a promising alternative by partitioning deep neural networks (DNNs) between edge devices and a central server, enabling localized data processing and reducing the amount of data required for transmission. Our contribution includes the joint optimization of bandwidth allocation, cut layer selection of the edge devices' DNN, and the central server's processing resource allocation. We investigate both parallel and serial data processing scenarios and propose low-complexity heuristic solutions that maintain near-optimal performance while reducing computational requirements. Numerical results show that our approach effectively reduces inference delay, demonstrating the potential of SL for improving real-time CV applications in dynamic, resource-constrained environments.
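The trade-off behind cut-layer selection can be illustrated with a toy delay model. The sketch below is not the authors' formulation: it assumes that the inference delay of one device is the sum of local computation, uplink transmission, and server computation, that the uplink follows a Shannon-rate AWGN model, and that the output of the last device-side layer is always uploaded. All function names, parameters, and numbers are hypothetical and chosen only for illustration.

```python
import math


def uplink_rate(bandwidth_hz, tx_power_w, channel_gain, noise_psd_w_per_hz):
    """Shannon rate in bit/s over an AWGN link (assumed channel model)."""
    snr = tx_power_w * channel_gain / (noise_psd_w_per_hz * bandwidth_hz)
    return bandwidth_hz * math.log2(1.0 + snr)


def total_delay(cut, layer_flops, out_bits, input_bits, f_dev, f_srv, rate):
    """Delay when layers [0, cut) run on the device and the rest on the server.

    Assumption: delay = local compute + uplink transmission + server compute.
    f_dev and f_srv are processing speeds in FLOP/s, rate is in bit/s.
    """
    local = sum(layer_flops[:cut]) / f_dev
    # With cut == 0 the raw input is uploaded; otherwise the output of the
    # last device-side layer is uploaded.
    payload = input_bits if cut == 0 else out_bits[cut - 1]
    transmit = payload / rate
    remote = sum(layer_flops[cut:]) / f_srv
    return local + transmit + remote


def best_cut(layer_flops, out_bits, input_bits, f_dev, f_srv, rate):
    """Exhaustively pick the cut layer with the smallest total delay."""
    cuts = range(len(layer_flops) + 1)
    return min(
        cuts,
        key=lambda c: total_delay(
            c, layer_flops, out_bits, input_bits, f_dev, f_srv, rate
        ),
    )


# Hypothetical three-layer network: FLOPs per layer and output size per layer.
layer_flops = [4e9, 2e9, 2e9]
out_bits = [1e6, 8e6, 4e6]
cut = best_cut(layer_flops, out_bits, input_bits=16e6,
               f_dev=4e9, f_srv=1e10, rate=1e7)
```

In this toy instance the first layer shrinks the data sharply, so cutting after it beats both uploading the raw input and running the whole network locally. The paper optimizes this choice jointly with the bandwidth share and the server resources of every device, which this single-device exhaustive search does not attempt.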

Paper Structure

This paper contains 15 sections, 2 theorems, 19 equations, 9 figures, 4 tables, 1 algorithm.

Key Result

Lemma 1

A solution of eq:equationClosed exists in the interval $(0, f_b)$, so that the optimal feasible resource allocation for all devices can always be obtained from eq:equalCons.

Figures (9)

  • Figure 1: Typical design of a BM.
  • Figure 2: A CV semantic segmentation DNN architecture consisting of multiple stacks of BMs of different characteristics.
  • Figure 3: Different possible instances of broken queues. Arrows symbolize the order of $C_k$ and rectangles their corresponding $F_k$. In the initial queue, red arrivals indicate the first element of each unbroken sub-queue, green arrivals the last element of each unbroken sub-queue and blue arrivals the intermediate elements of each sub-queue.
  • Figure 4: Average maximum delay versus number of devices.
  • Figure 5: Average maximum delay versus transmit power.
  • ...and 4 more figures

Theorems & Definitions (3)

  • Lemma 1
  • Definition 1
  • Lemma 2