Accuracy vs Performance: An abstraction model for deadline constrained offloading at the mobile-edge

Jamie Cotter; Ignacio Castineiras; Victor Cionca

TL;DR

This paper addresses deadline-constrained DNN offloading on mobile-edge devices by introducing a lightweight scheduler that models resources as guaranteed windows and discretises the network link to speed up placement queries, with a dynamic bandwidth estimator guiding task placement. The core contributions are two data structures: a resource-availability model using windows $[t_1,t_2)$ and per-configuration cores, and a discretised network link with base unit $D$ plus an index-based bucket system, complemented by a bandwidth-update mechanism. The approach enables priority-aware pre-emption and achieves lower latency under high-load conditions, demonstrated on Raspberry Pi-based edge devices running a YOLOv2 TensorFlow Lite pipeline. The findings reveal trade-offs between latency, accuracy, and network probing overhead, suggesting adaptive bandwidth testing to balance performance and network congestion on commodity hardware.
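The two data structures summarised above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Window`, `find_start`, and `DiscretisedLink`, the use of integer milliseconds, and the first-fit search are all assumptions made for illustration. Only the window form $[t_1,t_2)$ with a core count and the base unit $D$ with index-addressed buckets come from the summary.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Window:
    """Guaranteed-free interval [t1, t2) on a device, in ms, with free cores."""
    t1: int
    t2: int
    cores: int


def find_start(windows: List[Window], cores: int, earliest: int,
               duration: int, deadline: int) -> Optional[int]:
    """Return the earliest start time that fits inside a window, or None.

    Hypothetical first-fit query: the task needs `cores` cores for
    `duration` ms and must finish by `deadline`.
    """
    for w in sorted(windows, key=lambda w: w.t1):
        start = max(w.t1, earliest)
        if w.cores >= cores and start + duration <= min(w.t2, deadline):
            return start
    return None


class DiscretisedLink:
    """Link timeline split into slots of D ms; slot i covers [i*D, (i+1)*D)."""

    def __init__(self, D: int, horizon_slots: int) -> None:
        self.D = D
        self.busy = [False] * horizon_slots

    def reserve(self, start: int, duration: int) -> Optional[int]:
        """Book the earliest free run of slots at or after `start`.

        Returns the finish time in ms, or None if nothing fits.
        """
        first = -(-start // self.D)            # round start up to a slot boundary
        need = max(1, -(-duration // self.D))  # slots needed, rounded up
        for i in range(first, len(self.busy) - need + 1):
            if not any(self.busy[i:i + need]):
                self.busy[i:i + need] = [True] * need
                return (i + need) * self.D
        return None
```

Because a time maps to its bucket by integer division by $D$, a placement query inspects slots by index rather than comparing against arbitrary interval endpoints. The cost of that speed is precision: every transfer is rounded up to a whole number of slots, which is one way to read the accuracy versus performance trade-off in the title.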

Abstract

In this paper, we present a solution for low-latency, deadline-constrained DNN offloading on mobile edge devices. We design a scheduling algorithm with a lightweight network state representation that considers device availability, communication on the network link, priority-aware pre-emption, and task deadlines. To reduce latency, the algorithm relies on a resource availability representation, a discretisation of the network link, and a dynamic bandwidth estimation mechanism. We implement the scheduling algorithm in a system composed of four Raspberry Pi 2 Model B mobile edge devices that sample a waste classification conveyor belt at a set frame rate. The system is evaluated against our previous approach, which was shown to outperform work-stealers and a non-pre-emptive scheduling heuristic in the same waste classification scenario. Our findings show that the new lower-latency abstraction models yield better performance under high-volume workloads, and that dynamic bandwidth estimation assists task placement and ultimately increases task throughput in times of resource scarcity.
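To make the role of bandwidth estimation in deadline-aware placement concrete, the sketch below shows one simple way such an estimate could feed an offloading decision. The abstract does not specify the estimator, so an exponentially weighted moving average is swapped in here purely as an assumption; `BandwidthEstimator`, `meets_deadline`, and the smoothing factor `alpha` are hypothetical names and parameters.

```python
class BandwidthEstimator:
    """Running link-throughput estimate (assumed EWMA, not the paper's method)."""

    def __init__(self, initial_bps: float, alpha: float = 0.5) -> None:
        self.bps = initial_bps   # current estimate in bits per second
        self.alpha = alpha       # weight given to the newest sample

    def update(self, bytes_sent: int, seconds: float) -> float:
        """Fold one measured transfer into the estimate and return it."""
        sample = bytes_sent * 8 / seconds
        self.bps = self.alpha * sample + (1 - self.alpha) * self.bps
        return self.bps

    def transfer_time(self, payload_bytes: int) -> float:
        """Predicted seconds needed to send `payload_bytes` at the estimate."""
        return payload_bytes * 8 / self.bps


def meets_deadline(est: BandwidthEstimator, payload_bytes: int,
                   compute_s: float, now_s: float, deadline_s: float) -> bool:
    """True if transfer plus remote compute is predicted to finish in time."""
    return now_s + est.transfer_time(payload_bytes) + compute_s <= deadline_s
```

A stale estimate can make the scheduler offload tasks that then miss their deadline, while refreshing the estimate too often adds probe traffic to the link, so how frequently `update` is called is itself a tuning decision.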
