
HAWX: A Hardware-Aware FrameWork for Fast and Scalable ApproXimation of DNNs

Samira Nazari, Mohammad Saeed Almasi, Mahdi Taheri, Ali Azarpeyvand, Ali Mokhtari, Ali Mahani, Christian Herglotz

TL;DR

HAWX addresses the problem of exploring heterogeneous approximate computing (AxC) components in DNN accelerators with a hardware-aware, multi-level sensitivity scoring framework. It profiles DNNs to compute hawx scores at the operator, filter, layer, and model levels, and couples these scores with predictive accuracy, power, and area models to prune the design space and generate Pareto-optimal configurations for both spatial and temporal accelerators. The reported speedups over exhaustive search are large (e.g., over $23\times$ at layer level and over $10^{6}\times$ at filter level for LeNet-5, and up to $10^{4000}\times$ for EfficientNetLite), while accuracy remains comparable to exhaustive search and the benefits grow with network size. The work demonstrates practical hardware-aware AxC design for edge AI on diverse accelerators.
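To make "Pareto-optimal configurations" concrete, the following is a minimal sketch of generic Pareto filtering over (accuracy, power, area) triples. It is not the HAWX search algorithm; the `Config` type, the function names, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Config(NamedTuple):
    """Hypothetical candidate configuration (values are illustrative only)."""
    name: str
    accuracy: float  # higher is better
    power: float     # lower is better
    area: float      # lower is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on every metric and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.power <= b.power and a.area <= b.area
    strictly_better = a.accuracy > b.accuracy or a.power < b.power or a.area < b.area
    return no_worse and strictly_better


def pareto_front(configs: list[Config]) -> list[Config]:
    """Keep only configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Made-up numbers, not results from the paper.
candidates = [
    Config("exact", 0.99, 10.0, 8.0),
    Config("axc_a", 0.98, 7.0, 6.0),
    Config("axc_b", 0.97, 7.5, 6.5),  # worse than axc_a on all three metrics
    Config("axc_c", 0.95, 5.0, 4.0),
]
front = pareto_front(candidates)
print([c.name for c in front])
```

Here `axc_b` is dropped because `axc_a` is at least as good on every metric; the remaining three each represent a different accuracy/cost trade-off.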

Abstract

This work presents HAWX, a hardware-aware scalable exploration framework that employs multi-level sensitivity scoring at different DNN abstraction levels (operator, filter, layer, and model) to guide selective integration of heterogeneous AxC blocks. Supported by predictive models for accuracy, power, and area, HAWX accelerates the evaluation of candidate configurations, achieving over $23\times$ speedup in a layer-level search with two candidate approximate blocks and more than $(3\times10^{6})\times$ speedup in a filter-level search for LeNet-5 alone, while maintaining accuracy comparable to exhaustive search. Experiments across state-of-the-art DNN benchmarks such as VGG-11, ResNet-18, and EfficientNetLite demonstrate that the efficiency benefits of HAWX scale exponentially with network size. The HAWX hardware-aware search algorithm supports both spatial and temporal accelerator architectures, leveraging either off-the-shelf approximate components or customized designs.
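The exponential scaling mentioned above can be illustrated with simple counting. The sketch below assumes, as a simplification not taken from the paper, that an exhaustive search assigns one of `num_choices` implementations independently to each of `num_units` units (layers or filters), so the number of configurations is `num_choices ** num_units`. The function names and the unit counts are hypothetical.

```python
import math


def exhaustive_configs(num_units: int, num_choices: int) -> int:
    """Number of assignments when each unit independently picks one choice."""
    return num_choices ** num_units


def log10_configs(num_units: int, num_choices: int) -> float:
    """Order of magnitude of the search space, safe for very large counts."""
    return num_units * math.log10(num_choices)


# Hypothetical sizes: 5 layers with 3 choices each (e.g., exact plus two
# approximate blocks) versus 100 filters with the same 3 choices.
layer_level = exhaustive_configs(5, 3)
filter_level_magnitude = log10_configs(100, 3)
print(layer_level, round(filter_level_magnitude, 1))
```

Because the exponent is the number of units, moving from layer-level to filter-level granularity, or to a larger network, grows the exhaustive search space far faster than the network itself, which is why pruning with sensitivity scores pays off more on larger models.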

Paper Structure

This paper contains 11 sections, 9 equations, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: A high-level flow of the proposed methodology.
  • Figure 2: Activation distribution of LeNet-5 on representative inputs.
  • Figure 3: $hawx(o)$ distribution in LeNet-5.
  • Figure 4: Comparison of Accuracy (blue) and $hawx(model)$ (red) across DNNs.
  • Figure 5: Trade-off between accuracy, power, and area for dataflow and FGPU architectures.