HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark

Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Yingyan Celine Lin

TL;DR

The paper addresses a bottleneck in hardware-aware NAS by introducing HW-NAS-Bench, the first public benchmark that provides measured or estimated hardware-cost data for networks in two popular NAS search spaces across six devices. It presents a unified hardware-cost collection pipeline that yields latency and energy measurements (or estimates) for all architectures in NAS-Bench-201 and FBNet, and shows that common proxies such as FLOPs and #Params predict real costs poorly. The analysis shows that both hardware cost and architecture suitability depend strongly on the device, which argues for device-specific HW-NAS to achieve optimal accuracy-cost trade-offs. Demonstrations with ProxylessNAS show that non-hardware experts can perform HW-NAS by querying the benchmark. Code and data are publicly available.

Abstract

HardWare-aware Neural Architecture Search (HW-NAS) has recently gained tremendous attention by automating the design of DNNs deployed in more resource-constrained daily life devices. Despite its promising performance, developing optimal HW-NAS solutions can be prohibitively challenging as it requires cross-disciplinary knowledge in the algorithm, micro-architecture, and device-specific compilation. First, to determine the hardware-cost to be incorporated into the NAS process, existing works mostly adopt either pre-collected hardware-cost look-up tables or device-specific hardware-cost models. Both of them limit the development of HW-NAS innovations and impose a barrier-to-entry to non-hardware experts. Second, similar to generic NAS, it can be notoriously difficult to benchmark HW-NAS algorithms due to their significant required computational resources and the differences in adopted search spaces, hyperparameters, and hardware devices. To this end, we develop HW-NAS-Bench, the first public dataset for HW-NAS research which aims to democratize HW-NAS research to non-hardware experts and make HW-NAS research more reproducible and accessible. To design HW-NAS-Bench, we carefully collected the measured/estimated hardware performance of all the networks in the search spaces of both NAS-Bench-201 and FBNet, on six hardware devices that fall into three categories (i.e., commercial edge devices, FPGA, and ASIC). Furthermore, we provide a comprehensive analysis of the collected measurements in HW-NAS-Bench to provide insights for HW-NAS research. Finally, we demonstrate exemplary user cases to (1) show that HW-NAS-Bench allows non-hardware experts to perform HW-NAS by simply querying it and (2) verify that dedicated device-specific HW-NAS can indeed lead to optimal accuracy-cost trade-offs. The codes and all collected data are available at https://github.com/RICE-EIC/HW-NAS-Bench.
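The idea of performing HW-NAS "by simply querying" pre-collected costs can be illustrated with a minimal sketch. The table, architecture names, device keys, and numbers below are all hypothetical and made up for illustration; this is not the HW-NAS-Bench API or its data. The sketch only shows how a per-device cost lookup lets a search pick the most accurate architecture within a latency budget, and how the pick can differ between devices.

```python
# Hypothetical lookup table: accuracy (%) and per-device latency (ms).
# All names and numbers are invented for illustration only.
TABLE = {
    "arch_a": {"acc": 70.0, "latency_ms": {"edge_gpu": 5.0, "fpga": 9.0}},
    "arch_b": {"acc": 72.0, "latency_ms": {"edge_gpu": 8.0, "fpga": 6.0}},
    "arch_c": {"acc": 68.0, "latency_ms": {"edge_gpu": 6.0, "fpga": 4.0}},
}


def best_under_budget(table, device, max_latency_ms):
    """Return the name of the most accurate architecture whose latency on
    `device` does not exceed `max_latency_ms`, or None if none qualifies."""
    feasible = {
        name: rec
        for name, rec in table.items()
        if rec["latency_ms"][device] <= max_latency_ms
    }
    if not feasible:
        return None
    return max(feasible, key=lambda name: feasible[name]["acc"])


# Same latency budget, different target devices.
pick_gpu = best_under_budget(TABLE, "edge_gpu", 7.0)
pick_fpga = best_under_budget(TABLE, "fpga", 7.0)
print(pick_gpu, pick_fpga)
```

With these made-up numbers the two devices select different architectures under the same budget, which is the kind of device dependence the abstract argues for.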

Paper Structure

This paper contains 24 sections, 6 figures, 8 tables.

Figures (6)

  • Figure 1: An illustration of our proposed HW-NAS-Bench
  • Figure 2: Illustrating the hardware-cost collection pipeline applicable to various hardware devices.
  • Figure 3: Kendall Rank Correlation Coefficient between real-measured/estimated hardware-cost in different devices considering the NAS-Bench-201 search space.
  • Figure 4: Kendall Rank Correlation Coefficient between real-measured/estimated hardware-cost in different devices considering the FBNet search space.
  • Figure 5: Accuracy vs. hardware-cost on different devices considering NAS-Bench-201, where points in red denote the architectures with the optimal trade-offs between "accuracy on ImageNet16-120 vs. latency measured on Edge GPU", of which the architectures represent the ground truth of HW-NAS targeting Edge GPUs.
  • ...and 1 more figure
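Figures 3 and 4 report the Kendall Rank Correlation Coefficient between hardware costs. As a minimal sketch of what that coefficient measures, the function below computes the simple tau-a variant (concordant minus discordant pairs, divided by the total number of pairs), which may differ from the exact variant used in the paper. The FLOPs and latency values are made up for illustration and are not taken from the benchmark.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs.
    Tied pairs count as neither concordant nor discordant."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Made-up values for five hypothetical architectures.
flops_m = [50, 80, 120, 150, 200]        # proxy metric (MFLOPs)
latency_ms = [4.1, 3.2, 6.0, 5.1, 7.3]   # hypothetical measured latency

tau = kendall_tau(flops_m, latency_ms)
print(f"Kendall tau between FLOPs and latency: {tau:.2f}")
```

A value of 1 means the two metrics rank all architectures identically, and a value near 0 means the proxy ranking carries little information about the real cost ranking.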