HAWX: A Hardware-Aware FrameWork for Fast and Scalable ApproXimation of DNNs
Samira Nazari, Mohammad Saeed Almasi, Mahdi Taheri, Ali Azarpeyvand, Ali Mokhtari, Ali Mahani, Christian Herglotz
TL;DR
HAWX tackles the challenge of exploring heterogeneous approximate components in DNN accelerators with a hardware-aware, multi-level sensitivity scoring framework. It profiles DNNs to compute HAWX scores at the operator, filter, layer, and model levels, and couples these scores with predictive accuracy, power, and area models to prune the design space and generate Pareto-optimal configurations for both spatial and temporal accelerators. The results show large speedups over exhaustive search (e.g., over $23\times$ at the layer level and more than $(3\times10^{6})\times$ at the filter level for LeNet-5, and up to $10^{4000}\times$ for EfficientNetLite) while maintaining accuracy comparable to exhaustive search, with benefits that grow with network size. The work demonstrates practical hardware-aware approximate computing (AxC) design for edge AI on diverse accelerators.
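To make the notion of Pareto-optimal configurations concrete, the following is a minimal sketch of a Pareto filter over accuracy, power, and area. It is not taken from the HAWX implementation: the `Config` type, the function names, and all numeric values are hypothetical and serve only as an illustration.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    """Hypothetical candidate configuration (all values are made up)."""
    name: str
    accuracy: float  # higher is better
    power: float     # lower is better
    area: float      # lower is better


def dominates(a: Config, b: Config) -> bool:
    """Return True if a is no worse than b on every metric and better on one."""
    no_worse = (a.accuracy >= b.accuracy
                and a.power <= b.power
                and a.area <= b.area)
    strictly_better = (a.accuracy > b.accuracy
                       or a.power < b.power
                       or a.area < b.area)
    return no_worse and strictly_better


def pareto_front(configs: List[Config]) -> List[Config]:
    """Keep only the configurations that no other configuration dominates."""
    return [c for c in configs
            if not any(dominates(other, c) for other in configs)]


# Illustrative candidates; the numbers are assumptions, not measured results.
candidates = [
    Config("exact", 0.99, 10.0, 10.0),
    Config("mild", 0.98, 8.0, 9.0),
    Config("aggressive", 0.90, 5.0, 6.0),
    Config("poor", 0.89, 9.0, 9.5),
]
front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy set, `poor` is dropped because `mild` is at least as good on all three metrics. The quadratic pairwise comparison is adequate for small candidate sets; a pruned design space is what keeps such a filter cheap.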
Abstract
This work presents HAWX, a hardware-aware scalable exploration framework that employs multi-level sensitivity scoring at different DNN abstraction levels (operator, filter, layer, and model) to guide the selective integration of heterogeneous approximate computing (AxC) blocks. Supported by predictive models for accuracy, power, and area, HAWX accelerates the evaluation of candidate configurations. For LeNet-5 alone, it achieves over $23\times$ speedup in a layer-level search with two candidate approximate blocks and more than $(3\times10^{6})\times$ speedup in the filter-level search, while maintaining accuracy comparable to exhaustive search. Experiments across state-of-the-art DNN benchmarks such as VGG-11, ResNet-18, and EfficientNetLite demonstrate that the efficiency benefits of HAWX scale exponentially with network size. The HAWX hardware-aware search algorithm supports both spatial and temporal accelerator architectures, leveraging either off-the-shelf approximate components or customized designs.
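The exponential scaling can be illustrated with a short sketch that counts the configurations an exhaustive search would have to evaluate. It assumes each unit (layer or filter) independently selects one of `num_choices` blocks, giving `num_choices ** num_units` configurations. The function names and the example counts below are hypothetical and are not the figures used in the paper.

```python
import math


def exhaustive_configs(num_choices: int, num_units: int) -> int:
    """Number of configurations when each unit picks one of num_choices blocks."""
    return num_choices ** num_units


def log10_configs(num_choices: int, num_units: int) -> float:
    """Base-10 logarithm of the count, usable when the count itself is huge."""
    return num_units * math.log10(num_choices)


# Assumed setup: 3 choices per unit (e.g., one exact and two approximate blocks).
# The unit counts are illustrative only.
layer_level = exhaustive_configs(3, 5)     # a network with 5 layers
filter_level = exhaustive_configs(3, 22)   # the same network viewed as 22 filters

print(f"layer-level configurations:  {layer_level}")
print(f"filter-level configurations: {filter_level}")
print(f"log10 for 1000 units: {log10_configs(3, 1000):.1f}")
```

Moving from layer-level to filter-level granularity increases the exponent rather than a multiplicative factor, so the exhaustive baseline becomes infeasible quickly as networks grow.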
