Real-World Modeling of Computation Offloading for Neural Networks with Early Exits and Splits

Jan Danek, Zdenek Becvar, Adam Janes

TL;DR

This work studies real-world computation offloading of CNNs for autonomous systems, combining early exits with split computing so that part or all of the inference can run on MEC servers. It implements a VGG-16–style CNN with multiple exits and split points on a testbed consisting of an autonomous vehicle (AV) and an MEC server, uses a 5G network to account for both uplink and downlink delays, and applies autoencoder-based compression to reduce the amount of transmitted data. The authors derive practical models for total processing time $t_n^{total}(E,S)$ and total energy $E_n^{total}(E,S)$ and validate them with road sign recognition on the GTSRB dataset. Early exits reduce latency by up to $4.2\times$ and energy by up to $4.4\times$, and MEC offloading is substantially faster than local execution on the AV. Overall, the paper provides a real-world, end-to-end assessment of CNN offloading with early exits and splits, releases open-source code and data, and presents analytical models to guide future design and optimization in edge-enabled vision systems.

Abstract

We focus on computation offloading of applications based on convolutional neural networks (CNNs) from moving devices, such as mobile robots or autonomous vehicles, to Multi-access Edge Computing (MEC) servers via a mobile network. To reduce the overall CNN inference time, we design and implement a CNN with early exits and splits, allowing flexible partial or full offloading of the CNN inference. Through real-world experiments, we analyze the impact of CNN inference offloading on the total CNN processing delay, energy consumption, and classification accuracy in a practical road sign recognition task. The results confirm that offloading a CNN with early exits and splits can significantly reduce both the total processing delay and the energy consumption compared to full local processing, without impairing classification accuracy. Based on the results of the real-world experiments, we derive practical models for the energy consumption and total processing delay of offloading a CNN with early exits and splits.
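
To make the interplay of early exits and splits concrete, the following is a minimal sketch of how a total processing delay could be composed for a given exit $E$ and split point $S$. It is not the model derived in the paper. It assumes, purely for illustration, that the delay is the sum of the AV compute time up to the split, the uplink transfer, the MEC compute time from the split to the exit, and the downlink transfer. The function name, the index convention (`split` is the number of blocks run on the AV, `exit_idx` is the number of blocks run in total), and all numeric values are hypothetical placeholders, not measurements.

```python
from typing import Sequence


def total_delay(
    av_block_s: Sequence[float],   # hypothetical per-block compute time on the AV [s]
    mec_block_s: Sequence[float],  # hypothetical per-block compute time on the MEC server [s]
    exit_idx: int,                 # number of blocks executed before the chosen exit
    split: int,                    # number of blocks executed on the AV before offloading
    uplink_s: float,               # assumed uplink transfer time of intermediate data [s]
    downlink_s: float,             # assumed downlink transfer time of the result [s]
) -> float:
    """Illustrative additive delay model; an assumption, not the paper's formula."""
    if not 0 <= exit_idx <= len(av_block_s) or len(av_block_s) != len(mec_block_s):
        raise ValueError("exit_idx out of range or block lists differ in length")
    if split < 0:
        raise ValueError("split must be non-negative")

    if split >= exit_idx:
        # The exit is reached before the split point: everything runs locally,
        # so no network transfer is involved.
        return sum(av_block_s[:exit_idx])

    # Partial (split > 0) or full (split == 0) offloading.
    local = sum(av_block_s[:split])
    remote = sum(mec_block_s[split:exit_idx])
    return local + uplink_s + remote + downlink_s


# Placeholder values chosen only to exercise the function.
AV = [0.04, 0.04, 0.04, 0.04]
MEC = [0.01, 0.01, 0.01, 0.01]

full_local = total_delay(AV, MEC, exit_idx=4, split=4, uplink_s=0.02, downlink_s=0.005)
full_offload = total_delay(AV, MEC, exit_idx=4, split=0, uplink_s=0.02, downlink_s=0.005)
early_exit_split = total_delay(AV, MEC, exit_idx=2, split=1, uplink_s=0.02, downlink_s=0.005)
```

In this sketch, choosing an earlier exit shortens both sums, while moving the split changes how much work is done at the slower AV rate versus the faster MEC rate plus the network overhead. An energy model could be sketched the same way by replacing per-block times with per-block energies.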

Paper Structure

This paper contains 12 sections, 10 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: System model with the AV either processing the CNN inference locally or offloading the computation to the MEC server via a mobile network.
  • Figure 2: Model of early exit CNN with several splits.
  • Figure 3: Data flows from the AV’s camera to either (partial) offloading or full local processing on the AV. In the offloading scenario, data is transmitted via a mobile network to the MEC server for further computation.
  • Figure 4: Impact of split point $S$ and early exit $E$ on classification accuracy $A(E,S)$.
  • Figure 5: Impact of split point $S$ and early exit $E$ on total processing delay $t_n^{total}(E,S)$. Whiskers represent one standard deviation (68% confidence).
  • ...and 3 more figures