A Comprehensive Survey on Hardware-Aware Neural Architecture Search

Hadjer Benmeziane, Kaoutar El Maghraoui, Hamza Ouarnoughi, Smail Niar, Martin Wistuba, Naigang Wang

TL;DR

This survey addresses the challenge of deploying neural architectures on diverse, resource-constrained hardware by framing hardware-aware NAS (HW-NAS) as a multi-objective design problem. It introduces a four-dimensional taxonomy (search space, search strategy, acceleration technique, hardware cost estimation) and provides a comprehensive review of HW-NAS search spaces, problem formulations, and cost models. Key contributions include clarifying HW-NAS design choices, comparing optimization approaches (RL, EA, gradient-based, and Bayesian methods), and cataloging hardware cost estimation techniques and benchmarks. The paper highlights industrial adoption, benchmarking limitations, and future directions toward robust, transferable HW-NAS and hardware-software co-design. Overall, it aims to pave the way for practical HW-NAS methods that democratize efficient DL deployment across varied devices and platforms.

Abstract

Neural Architecture Search (NAS) methods have been growing in popularity. These techniques have been fundamental to automating and speeding up the time-consuming and error-prone process of synthesizing novel Deep Learning (DL) architectures. NAS has been extensively studied in the past few years. Arguably, its most significant impact has been in image classification and object detection tasks, where state-of-the-art results have been obtained. Despite the significant success achieved to date, applying NAS to real-world problems still poses significant challenges and is not widely practical. In general, the synthesized Convolutional Neural Network (CNN) architectures are too complex to be deployed on resource-limited platforms, such as IoT, mobile, and embedded systems. One solution growing in popularity is to use multi-objective optimization algorithms in the NAS search strategy, taking into account execution latency, energy consumption, memory footprint, etc. This kind of NAS, called hardware-aware NAS (HW-NAS), makes searching for the most efficient architecture more complicated and opens several questions. In this survey, we provide a detailed review of existing HW-NAS research and categorize it according to four key dimensions: the search space, the search strategy, the acceleration technique, and the hardware cost estimation strategies. We further discuss the challenges and limitations of existing approaches and potential future directions. This is the first survey paper focusing on hardware-aware NAS. We hope it serves as a valuable reference for the various techniques and algorithms discussed and paves the way for future research on hardware-aware NAS.
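As a rough illustration of the multi-objective view described in the abstract, the sketch below filters a handful of candidate architectures down to their Pareto front over error, latency, and energy. It is not taken from the survey: the `Candidate` type, the helper names, and all numeric values are hypothetical and chosen only to show the idea that no single "best" architecture exists once hardware costs are added as objectives.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical candidate architecture with three objectives to minimize."""
    name: str
    error: float       # 1 - accuracy (lower is better)
    latency_ms: float  # execution latency on the target device (lower is better)
    energy_mj: float   # energy per inference (lower is better)


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if `a` is no worse than `b` on every objective
    and strictly better on at least one."""
    a_obj, b_obj = a[1:], b[1:]  # skip the name field
    no_worse = all(x <= y for x, y in zip(a_obj, b_obj))
    strictly_better = any(x < y for x, y in zip(a_obj, b_obj))
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up numbers for illustration only; they are not measurements.
candidates = [
    Candidate("net_a", error=0.24, latency_ms=80.0, energy_mj=300.0),
    Candidate("net_b", error=0.28, latency_ms=35.0, energy_mj=120.0),
    Candidate("net_c", error=0.30, latency_ms=90.0, energy_mj=310.0),
    Candidate("net_d", error=0.26, latency_ms=35.0, energy_mj=150.0),
]

front = pareto_front(candidates)
for c in front:
    print(c.name, c.error, c.latency_ms, c.energy_mj)
```

In this toy set, `net_c` is worse than `net_a` on every objective and is dropped, while the remaining three each represent a different accuracy/cost trade-off. A real HW-NAS method would obtain the latency and energy values from a hardware cost estimator or on-device measurement rather than from fixed constants.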

Paper Structure

This paper contains 56 sections, 7 equations, 16 figures, 9 tables, 1 algorithm.

Figures (16)

  • Figure 1: Generic CNN architecture. For each layer, an operator is chosen from a pre-defined list (convolution, dilated convolution, depthwise convolution, max pooling, batch normalization, ...).
  • Figure 2: Accuracy of various CNN models on ImageNet for the image classification task, together with their number of parameters. Inspired by 9043731
  • Figure 3: Overview of conventional NAS components
  • Figure 4: Number of papers describing HW-NAS by Dec 2020. The top 5 conferences and journals are: NeurIPS, ECCV, IEEE Transactions on Pattern Analysis and Machine Intelligence, IJCNN, and MICCAI.
  • Figure 5: Illustration of sparsification. (a) Weight Pruning. (b) Neuron Pruning. Gray lines correspond to pruned vertices (i.e. weights) and white nodes correspond to pruned neurons. Source 9043731
  • ...and 11 more figures