Adaptive Workload Distribution for Accuracy-aware DNN Inference on Collaborative Edge Platforms

Zain Taufique, Antonio Miele, Pasi Liljeberg, Anil Kanduri

TL;DR

This work proposes adaptive workload distribution for DNN inference that jointly considers the node-level heterogeneity of edge devices and application-specific accuracy and performance requirements. The approach combinatorially optimizes heterogeneity-aware workload partitioning and dynamic accuracy configuration of DNN models to ensure performance and accuracy guarantees.

Abstract

DNN inference can be accelerated by distributing the workload among a cluster of collaborative edge nodes. Heterogeneity among edge devices and accuracy-performance trade-offs of DNN models present a complex exploration space while catering to the inference performance requirements. In this work, we propose adaptive workload distribution for DNN inference, jointly considering node-level heterogeneity of edge devices, and application-specific accuracy and performance requirements. Our proposed approach combinatorially optimizes heterogeneity-aware workload partitioning and dynamic accuracy configuration of DNN models to ensure performance and accuracy guarantees. We tested our approach on an edge cluster of Odroid XU4, Raspberry Pi4, and Jetson Nano boards and achieved an average gain of 41.52% in performance and 5.2% in output accuracy as compared to state-of-the-art workload distribution strategies.
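The joint choice described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a brute-force search over per-node model configurations, a split of the workload proportional to each node's throughput, and made-up profile numbers. The names `PROFILES`, `partition`, and `plan` are hypothetical.

```python
from itertools import product

# Hypothetical profiles: node -> list of (throughput in images/s, accuracy),
# one entry per available model configuration. Values are illustrative only
# and are not measurements from the paper.
PROFILES = {
    "odroid_xu4": [(4.0, 0.72), (8.0, 0.66)],
    "raspberry_pi4": [(5.0, 0.72), (10.0, 0.66)],
    "jetson_nano": [(20.0, 0.72), (30.0, 0.66)],
}


def partition(total_items, throughputs):
    """Split total_items across nodes in proportion to their throughput."""
    total_tp = sum(throughputs)
    shares = [int(total_items * tp / total_tp) for tp in throughputs]
    # Hand out the items lost to rounding down, fastest nodes first.
    leftover = total_items - sum(shares)
    order = sorted(range(len(shares)), key=lambda i: throughputs[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def plan(total_items, deadline_s, min_accuracy, profiles=PROFILES):
    """Try every combination of per-node configurations (assumed search
    strategy) and return the feasible plan with the highest accuracy."""
    nodes = list(profiles)
    best = None
    for choice in product(*(range(len(profiles[n])) for n in nodes)):
        tps = [profiles[n][c][0] for n, c in zip(nodes, choice)]
        accs = [profiles[n][c][1] for n, c in zip(nodes, choice)]
        shares = partition(total_items, tps)
        # Nodes run in parallel, so the slowest node sets the latency.
        latency = max(s / tp for s, tp in zip(shares, tps))
        # Output accuracy is the workload-weighted mean over nodes.
        accuracy = sum(s * a for s, a in zip(shares, accs)) / total_items
        if latency > deadline_s or accuracy < min_accuracy:
            continue
        if best is None or accuracy > best["accuracy"]:
            best = {
                "shares": dict(zip(nodes, shares)),
                "configs": dict(zip(nodes, choice)),
                "latency": latency,
                "accuracy": accuracy,
            }
    return best
```

Calling `plan(100, deadline_s=4.0, min_accuracy=0.7)` returns a dictionary with the per-node shares and chosen configuration indices, or `None` when no combination meets both constraints. Exhaustive enumeration grows exponentially with the number of nodes, so it only suits small clusters like the three-board setup mentioned above.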

Paper Structure (12 sections, 9 figures, 1 table, 1 algorithm)

Figures (9)

  • Figure 1: Accuracy-performance trade-offs of inferring MobileNetV2 across different edge devices.
  • Figure 2: Workload distribution through different strategies. (a) Workload distribution and selected approximation level on each board. (b) Overall output performance and accuracy.
  • Figure 3: System diagram showing (a) the edge cluster and (b) the node-level software and hardware modules.
  • Figure 4: State Machine of gn and ln Resource Manager.
  • Figure 5: Workload distribution approach.
  • ...and 4 more figures