Robustifying and Boosting Training-Free Neural Architecture Search

Zhenfeng He, Yao Shu, Zhongxiang Dai, Bryan Kian Hsiang Low

TL;DR

The robustifying and boosting training-free NAS (RoBoT) algorithm uses Bayesian optimization to find an optimized combination of existing training-free metrics, yielding a metric that is robust and consistently better-performing on diverse tasks, and then applies greedy search on this new metric to bridge the estimation gap between training-free metrics and true architecture performance.

Abstract

Neural architecture search (NAS) has become a key component of AutoML and a standard tool to automate the design of deep neural networks. Recently, training-free NAS has emerged as a paradigm that reduces the search costs of standard training-based NAS by estimating the true architecture performance with only training-free metrics. Nevertheless, the estimation ability of these metrics typically varies across tasks, making it challenging to achieve robust and consistently good search performance on diverse tasks with only a single training-free metric. Meanwhile, the estimation gap between training-free metrics and the true architecture performance prevents training-free NAS from achieving superior performance. To address these challenges, we propose the robustifying and boosting training-free NAS (RoBoT) algorithm, which (a) uses Bayesian optimization to explore an optimized combination of existing training-free metrics, yielding a robust and consistently better-performing metric on diverse tasks, and (b) applies greedy search, i.e., exploitation, on the newly developed metric to bridge the aforementioned gap and thereby further boost the search performance of standard training-free NAS. Remarkably, the expected performance of RoBoT can be theoretically guaranteed, and it improves over existing training-free NAS under mild conditions, with additional interesting insights. Our extensive experiments on various NAS benchmark tasks yield substantial empirical evidence to support our theoretical results.
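
The two stages described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: plain random search over weight vectors stands in for Bayesian optimization, and the names `robot_sketch`, `pool`, `evaluate`, `budget`, and `explore_steps`, as well as all toy scores, are hypothetical.

```python
import random


def combined_score(metrics, weights):
    """Weighted linear combination of one architecture's training-free metrics."""
    return sum(w * m for w, m in zip(weights, metrics))


def robot_sketch(pool, evaluate, budget, explore_steps, seed=0):
    """pool maps architecture -> tuple of metric scores; evaluate returns true performance.

    Assumption: random search replaces the Bayesian optimization used in the paper.
    """
    if not 1 <= explore_steps <= budget:
        raise ValueError("need 1 <= explore_steps <= budget")
    rng = random.Random(seed)
    n_metrics = len(next(iter(pool.values())))
    cache = {}  # architecture -> true performance, so each one is evaluated once

    def query(arch):
        if arch not in cache:
            cache[arch] = evaluate(arch)
        return cache[arch]

    # Stage (a): explore weight vectors; each one is judged by the true
    # performance of the architecture it ranks highest.
    best_w, best_perf = None, float("-inf")
    for _ in range(explore_steps):
        w = [rng.uniform(-1.0, 1.0) for _ in range(n_metrics)]
        top = max(pool, key=lambda a: combined_score(pool[a], w))
        perf = query(top)
        if perf > best_perf:
            best_w, best_perf = w, perf

    # Stage (b): greedy search (exploitation) down the ranking induced by the
    # best weight vector, until the evaluation budget is spent.
    ranked = sorted(pool, key=lambda a: combined_score(pool[a], best_w), reverse=True)
    for arch in ranked:
        if len(cache) >= budget:
            break
        query(arch)

    best_arch = max(cache, key=cache.get)
    return best_arch, cache[best_arch]


# Toy data (made up for illustration): two metric scores per architecture.
pool = {"arch1": (0.2, 0.9), "arch2": (0.8, 0.3), "arch3": (0.5, 0.5), "arch4": (0.9, 0.1)}
true_perf = {"arch1": 0.94, "arch2": 0.91, "arch3": 0.89, "arch4": 0.85}
best, perf = robot_sketch(pool, true_perf.get, budget=4, explore_steps=2)
```

In this sketch the budget counts distinct architectures whose true performance is queried, so the exploration stage and the greedy stage share one evaluation budget.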

Paper Structure

This paper contains 58 sections, 6 theorems, 14 equations, 10 figures, 9 tables, 2 algorithms.

Key Result

Proposition 1

Suppose there are two estimation metrics ${\mathcal{M}}_1$, ${\mathcal{M}}_2$, and an objective evaluation metric $f$. If $\mathrm{Cov}[{\mathcal{M}}_1,{\mathcal{M}}_2] \neq \frac{\mathrm{Cov}[{\mathcal{M}}_2,f]\,\mathrm{Var}[{\mathcal{M}}_1]}{\mathrm{Cov}[{\mathcal{M}}_1,f]}$ and $\mathrm{Cov}[\ldots]$ [...] where $\mathrm{Cov}[X, Y]$ and $\rho_{\text{Pearson}}(X, Y)$ are the covariance and Pearson's correlation [...]
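
As a rough numerical illustration of the quantities involved, the sketch below computes Pearson's correlation between each of two metrics and $f$, and between a weighted combination of the metrics and $f$. The data and the weights `w1`, `w2` are made up for illustration, and the example does not check the proposition's conditions; it only shows a case in which a linear combination correlates more strongly with $f$ than either metric alone.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation: Cov[x, y] / sqrt(Var[x] * Var[y])."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical true performance f and two noisy metrics whose errors cancel.
f = [1, 2, 3, 4, 5, 6]
m1 = [2, 1, 4, 3, 6, 5]  # f plus an alternating +1/-1 error
m2 = [0, 3, 2, 5, 4, 7]  # f minus the same error

w1, w2 = 0.5, 0.5  # illustrative weights, chosen by hand
combined = [w1 * a + w2 * b for a, b in zip(m1, m2)]

r1 = pearson(m1, f)
r2 = pearson(m2, f)
r_combined = pearson(combined, f)
print(r1, r2, r_combined)
```

Here the two metrics carry opposite errors, so averaging them recovers $f$ exactly; real metrics would only partially cancel, which is why the weights need to be searched rather than fixed by hand.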

Figures (10)

  • Figure 1: (a) Scores of Training-free Metrics 1 and 2 for four architectures on a specific task; a higher score indicates a stronger recommendation. The true architecture performance is ranked as $1 > 2 > 3 > 4$. (b) The highest-scoring architecture selected for each weight vector over Metric 1 and Metric 2. All weight vectors within a region select the same architecture, which is indicated by the region's color.
  • Figure 2: Comparison of NAS algorithms in NAS-Bench-201 and TransNAS-Bench-101-micro regarding the number of searched architectures. RoBoT and HNAS are reported with the mean and standard error of 10 runs; RS, REA, and REINFORCE are reported over 50 runs.
  • Figure 3: Comparison between RoBoT and other NAS baselines in NAS-Bench-201 regarding the number of searched architectures. Note that RoBoT and HNAS are reported with the mean and standard error of 10 independent searches, while RS, REA, and REINFORCE are reported with 50 independent searches.
  • Figure 4: Similarity and correlation among the varying optimized linear combination weights on 4 tasks of TransNAS-Bench-101-micro.
  • Figure 5: Comparison between different values of $T_0$ and RoBoT on 5 tasks in TransNAS-Bench-101-micro regarding the number of searched architectures. Note that all methods are reported with the mean and standard error of 10 independent searches.
  • ...and 5 more figures

Theorems & Definitions (8)

  • Proposition 1
  • Theorem 1
  • Theorem 2
  • Lemma 1
  • proof
  • Theorem 3: Corollary 18 in linear-partial-monioring
  • Theorem 4
  • proof