Accel-NASBench: Sustainable Benchmarking for Accelerator-Aware NAS

Afzal Ahmad, Linfeng Du, Zhiyao Xie, Wei Zhang

TL;DR

Accel-NASBench addresses the sustainability problem of neural architecture search (NAS) benchmarks by searching for training proxies that preserve architecture rankings while substantially reducing training cost. It constructs the first zero-cost NAS benchmark for ImageNet2012, augmented with end-to-end on-device throughput and latency measurements for GPUs, TPUs, and FPGAs, which enables accelerator-aware evaluation. The method maximizes proxy fidelity, measured by Kendall's tau, under a training-time constraint. XGBoost surrogates achieve high predictive quality, and simulated bi-objective searches closely match true-search performance at a fraction of the cost. The work provides a practical, scalable framework for realistic, hardware-aware NAS benchmarking and lays the groundwork for sustainable large-scale benchmarks.
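The proxy-selection idea can be illustrated with a minimal sketch: among candidate training schemes that fit a time budget, keep the one whose accuracies rank architectures most like the reference scheme does. The candidate names, accuracies, training hours, and the `pick_proxy` helper are all hypothetical, and the exhaustive comparison over a fixed candidate set is an assumption made for illustration, not the paper's actual search procedure. Kendall's tau is computed here in its simple tau-a form.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def pick_proxy(reference_acc, candidates, time_budget):
    """Return (name, tau) of the best-ranking proxy within budget, else None.

    candidates maps a scheme name to (proxy accuracies, training hours).
    """
    best = None
    for name, (proxy_acc, train_hours) in candidates.items():
        if train_hours > time_budget:
            continue  # violates the training-time constraint
        tau = kendall_tau(reference_acc, proxy_acc)
        if best is None or tau > best[1]:
            best = (name, tau)
    return best


# Hypothetical accuracies of five architectures under the reference scheme.
reference_acc = [76.1, 74.3, 77.8, 72.9, 75.0]

# Hypothetical proxy schemes: (accuracies of the same five models, hours).
candidates = {
    "short_schedule": ([61.0, 59.2, 63.5, 58.1, 60.4], 2.0),
    "low_resolution": ([55.3, 56.0, 57.9, 52.2, 54.8], 1.0),
    "near_full": ([75.9, 74.0, 77.5, 72.5, 74.8], 9.0),
}

best = pick_proxy(reference_acc, candidates, time_budget=3.0)
print(best)
```

Note that the proxy accuracies are much lower than the reference ones in absolute terms; only the ordering of architectures matters for the ranking correlation.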

Abstract

One of the primary challenges impeding the progress of Neural Architecture Search (NAS) is its extensive reliance on exorbitant computational resources. NAS benchmarks aim to simulate runs of NAS experiments at zero cost, remediating the need for extensive compute. However, existing NAS benchmarks use synthetic datasets and model proxies that make simplified assumptions about the characteristics of these datasets and models, leading to unrealistic evaluations. We present a technique that allows searching for training proxies that reduce the cost of benchmark construction by significant margins, making it possible to construct realistic NAS benchmarks for large-scale datasets. Using this technique, we construct an open-source bi-objective NAS benchmark for the ImageNet2012 dataset combined with the on-device performance of accelerators, including GPUs, TPUs, and FPGAs. Through extensive experimentation with various NAS optimizers and hardware platforms, we show that the benchmark is accurate and allows searching for state-of-the-art hardware-aware models at zero cost.

Paper Structure (18 sections, 1 equation, 6 figures, 2 tables)

Figures (6)

  • Figure 1: NAS flow consists of an optimizer sampling architectures from a search space, followed by evaluation in terms of accuracy and on-device performance. A NAS benchmark sidesteps the expensive evaluation phase by using surrogate predictors that offer zero-cost evaluation.
  • Figure 2: (Top) Proposed method for searching for a proxified training scheme; (bottom) use of the searched scheme ($p^{*}$) and hardware accelerators to collect the datasets used to construct Accel-NASBench.
  • Figure 3: Validation of $p^{*}$ using 120 random unseen models, each trained using both $p^{*}$ and $r$. Architecture rankings are strongly correlated between the two schemes (Kendall's $\tau=0.926$).
  • Figure 4: Search using RL-based bi-objective optimization. Panel (a) shows the Pareto-optimal front from simulated search on accuracy-latency objectives. Panels (b)-(f) show the results of accuracy-throughput search on (b) ZCU102 and (c) VCK190 FPGAs, (d) TPUv3, (e) A100 and (f) RTX 3090 GPUs. Red, magenta, and purple star markers indicate Pareto-optimal solutions hand-picked for evaluation; the legends show their performance as predicted by the surrogates.
  • Figure 5: Comparison of uni-objective search trajectories between (a) true runs and (b) simulated runs using Accel-NASBench.
  • ...and 1 more figure
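The Pareto-optimal fronts mentioned in the Figure 4 caption can be made concrete with a small sketch. It assumes each candidate is described by a predicted accuracy and a throughput, both to be maximized; the model names and numbers are hypothetical and the `pareto_front` helper is illustrative, not part of the benchmark's API.

```python
def pareto_front(points):
    """Return the non-dominated (name, accuracy, throughput) tuples.

    Both objectives are maximized. A point is dominated if another point is
    at least as good on both objectives and strictly better on one.
    """
    front = []
    for name, acc, thr in points:
        dominated = any(
            (a >= acc and t >= thr) and (a > acc or t > thr)
            for _, a, t in points
        )
        if not dominated:
            front.append((name, acc, thr))
    # Sort by throughput so the front reads left to right on a plot.
    return sorted(front, key=lambda p: p[2])


# Hypothetical surrogate predictions: (model, top-1 accuracy %, images/s).
models = [
    ("m0", 75.0, 1200.0),
    ("m1", 77.0, 900.0),
    ("m2", 74.0, 1100.0),  # dominated by m0
    ("m3", 78.5, 600.0),
    ("m4", 76.0, 850.0),   # dominated by m1
]

front = pareto_front(models)
for name, acc, thr in front:
    print(f"{name}: accuracy={acc:.1f}%, throughput={thr:.0f} img/s")
```

For an accuracy-latency search, as in panel (a), latency would be minimized instead; negating the latency values before calling the helper is one simple way to reuse it.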