MTL-Split: Multi-Task Learning for Edge Devices using Split Computing

Luigi Capogrosso, Enrico Fraccaroli, Samarjit Chakraborty, Franco Fummi, Marco Cristani

TL;DR

MTL-Split addresses the challenge of running multi-task deep inference on resource-constrained edge devices by combining Split Computing with Multi-Task Learning. The architecture places a shared backbone on the edge and task-solving heads on a remote server, producing a lightweight shared feature $Z_b$ to reduce data transfer while enabling multiple tasks to be solved jointly. Empirical results across synthetic and real datasets show accuracy gains for all tasks and substantial reductions in communication and latency compared to fully local or fully remote setups. This approach has practical implications for edge AI in latency-sensitive domains like automotive, where bandwidth and compute constraints are critical and multi-task inference is common.

Abstract

Split Computing (SC), where a Deep Neural Network (DNN) is intelligently split with a part of it deployed on an edge device and the rest on a remote server, is emerging as a promising approach. It allows the power of DNNs to be leveraged for latency-sensitive applications that cannot have the entire DNN deployed remotely, yet lack sufficient computational bandwidth locally. In many such embedded systems scenarios, such as those in the automotive domain, computational resource constraints also necessitate Multi-Task Learning (MTL), where the same DNN is used for multiple inference tasks instead of having dedicated DNNs for each task, which would need more computing bandwidth. However, how to partition such a multi-tasking DNN for deployment within an SC framework has not been sufficiently studied. This paper studies this problem, and MTL-Split, our novel proposed architecture, shows encouraging results on both synthetic and real-world data. The source code is available at https://github.com/intelligolabs/MTL-Split.
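
The split described above (a shared backbone on the edge producing a compact feature $Z_b$, with one head per task on the server) can be illustrated with a minimal sketch. This is not the authors' implementation, which lives in the linked repository. The layer sizes, the task names `task_a` and `task_b`, and the random weights standing in for trained parameters are all assumptions chosen only to show the data flow.

```python
import random


def make_linear(n_in, n_out, rng):
    """Random weight matrix (list of rows); a stand-in for trained weights."""
    return [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]


def linear(weights, x):
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


class EdgeBackbone:
    """Shared backbone (hypothetical, single layer) that runs on the edge device."""

    def __init__(self, n_in, n_feat, rng):
        self.w = make_linear(n_in, n_feat, rng)

    def __call__(self, x):
        # The output plays the role of the shared feature Z_b.
        return relu(linear(self.w, x))


class ServerHeads:
    """One task-solving head per task, all consuming the same Z_b on the server."""

    def __init__(self, n_feat, task_classes, rng):
        self.heads = {
            name: make_linear(n_feat, n_cls, rng)
            for name, n_cls in task_classes.items()
        }

    def __call__(self, z_b):
        return {name: linear(w, z_b) for name, w in self.heads.items()}


rng = random.Random(0)
backbone = EdgeBackbone(n_in=64, n_feat=8, rng=rng)  # sizes are assumptions
heads = ServerHeads(n_feat=8, task_classes={"task_a": 3, "task_b": 5}, rng=rng)

x = [rng.random() for _ in range(64)]  # raw input captured on the edge
z_b = backbone(x)                      # computed on the edge
outputs = heads(z_b)                   # computed on the server from the transmitted Z_b

print("values sent over the network:", len(z_b), "instead of", len(x))
print({name: len(logits) for name, logits in outputs.items()})
```

In this toy setup only the 8 values of `z_b` cross the network rather than the 64 input values, and a single transmission serves both heads, which is the source of the communication savings the summary refers to.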

Paper Structure (23 sections, 7 equations, 1 figure, 4 tables)

Figures (1)

  • Figure 1: The proposed architecture for handling complex inference tasks on edge devices by integrating SC and MTL.