Accuracy vs Performance: An abstraction model for deadline constrained offloading at the mobile-edge

Jamie Cotter, Ignacio Castineiras, Victor Cionca

TL;DR

This paper addresses deadline-constrained DNN offloading on mobile-edge devices by introducing a lightweight scheduler that models resources as guaranteed windows and discretises the network link to speed up placement queries, with a dynamic bandwidth estimator guiding task placement. The core contributions are two data structures: a resource-availability model using windows $[t_1,t_2)$ and per-configuration cores, and a discretised network link with base unit $D$ plus an index-based bucket system, complemented by a bandwidth-update mechanism. The approach enables priority-aware pre-emption and achieves lower latency under high-load conditions, demonstrated on Raspberry Pi-based edge devices running a YOLOv2 TensorFlow Lite pipeline. The findings reveal trade-offs between latency, accuracy, and network probing overhead, suggesting adaptive bandwidth testing to balance performance and network congestion on commodity hardware.
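The two data structures named above can be illustrated with a minimal sketch. It is not the authors' implementation. The class names `Window` and `DiscretisedLink`, the `fits` and `reserve` methods, the boolean per-bucket occupancy, and the choice of seconds as the time unit are all assumptions made for illustration. Only the window $[t_1,t_2)$ with a core count, the base unit $D$, and the idea of locating a bucket by index come from the summary.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Window:
    """Hypothetical availability window [t1, t2) with `cores` free cores."""
    t1: float
    t2: float
    cores: int

    def fits(self, start: float, duration: float, cores: int) -> bool:
        # A task fits if it needs no more cores than the window offers
        # and its whole execution lies inside the half-open window.
        return (cores <= self.cores
                and self.t1 <= start
                and start + duration <= self.t2)


@dataclass
class DiscretisedLink:
    """Hypothetical link model: time is cut into buckets of length D."""
    D: float          # base unit (assumed to be seconds)
    horizon: int      # number of buckets tracked (assumption)
    busy: List[bool] = field(init=False)

    def __post_init__(self) -> None:
        self.busy = [False] * self.horizon

    def index(self, t: float) -> int:
        # Index-based lookup: the bucket is computed, not searched for.
        return int(t // self.D)

    def reserve(self, start: float, duration: float) -> bool:
        # Buckets touched by the transfer [start, start + duration).
        first = self.index(start)
        end = math.ceil((start + duration) / self.D)
        if first < 0 or end > self.horizon:
            return False
        if any(self.busy[first:end]):
            return False  # the link is already taken in that span
        for i in range(first, end):
            self.busy[i] = True
        return True


link = DiscretisedLink(D=0.5, horizon=10)
window = Window(t1=0.0, t2=5.0, cores=2)
first_transfer = link.reserve(1.0, 1.0)    # occupies buckets 2 and 3
clashing_transfer = link.reserve(1.5, 0.5)  # bucket 3 is already busy
task_fits = window.fits(start=2.0, duration=2.0, cores=2)
```

The point of the index computation is that a placement query touches only the buckets a transfer would occupy, so its cost depends on the transfer length rather than on how many reservations already exist. A coarser $D$ means fewer buckets to check but more rounding of transfer times, which is one way to read the accuracy versus performance trade-off in the title.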

Abstract

In this paper, we present a solution for low-latency deadline-constrained DNN offloading on mobile edge devices. We design a scheduling algorithm with a lightweight network state representation that considers device availability, communication on the network link, priority-aware pre-emption, and task deadlines. The algorithm reduces scheduling latency through a resource availability representation, a discretisation of the network link, and a dynamic bandwidth estimation mechanism. We implement the scheduling algorithm in a system composed of four Raspberry Pi 2 Model B mobile edge devices, sampling a waste classification conveyor belt at a set frame rate. The system is evaluated against a previous approach of ours, which was shown to outperform work-stealers and a non-pre-emption based scheduling heuristic under the same waste classification scenario. Our findings show that the new lower-latency abstraction models yield better performance under high-volume workloads, with the dynamic bandwidth estimation assisting task placement and ultimately increasing task throughput in times of resource scarcity.

Paper Structure

This paper contains 19 sections, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Task Pipeline
  • Figure 2: Example of writing to a resource availability data structure
  • Figure 3: An example of the discretised network link
  • Figure 4: Task Completion across various categories.
  • Figure 5: Scheduling latency by initial allocation and pre-emption/reallocation scenarios for both schedulers.
  • ...and 3 more figures