Robustifying and Boosting Training-Free Neural Architecture Search

Zhenfeng He; Yao Shu; Zhongxiang Dai; Bryan Kian Hsiang Low

TL;DR

The paper proposes the robustifying and boosting training-free NAS (RoBoT) algorithm. RoBoT uses Bayesian optimization to find an optimized combination of existing training-free metrics, yielding a single metric that is robust and performs consistently better across diverse tasks. It then applies greedy search on this new metric to bridge the estimation gap between training-free metrics and true architecture performance.

Abstract

Neural architecture search (NAS) has become a key component of AutoML and a standard tool for automating the design of deep neural networks. Recently, training-free NAS has emerged as a paradigm that reduces the search cost of standard training-based NAS by estimating true architecture performance with only training-free metrics. Nevertheless, the estimation ability of these metrics typically varies across tasks, making it challenging to achieve robust and consistently good search performance on diverse tasks with a single training-free metric. Meanwhile, the estimation gap between training-free metrics and true architecture performance keeps training-free NAS from achieving superior performance. To address these challenges, we propose the robustifying and boosting training-free NAS (RoBoT) algorithm, which (a) uses Bayesian optimization to explore combinations of existing training-free metrics and employs the optimized combination as a robust and consistently better-performing metric on diverse tasks, and (b) applies greedy search, i.e., exploitation, on the newly developed metric to bridge the aforementioned gap and thereby further boost the search performance of standard training-free NAS. Remarkably, the expected performance of RoBoT can be theoretically guaranteed, and it improves over existing training-free NAS under mild conditions, with additional interesting insights. Our extensive experiments on various NAS benchmark tasks yield substantial empirical evidence to support our theoretical results.
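The two phases described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: random sampling of weight vectors stands in for Bayesian optimization, the combination is assumed to be a weighted sum, `T` is assumed to be the total number of architectures that may be evaluated, `T0` is assumed to be the number of exploration rounds, and the function names, weight range, and data are hypothetical.

```python
import random

def rank_architectures(weights, metrics):
    """Sort architecture indices from highest to lowest weighted-sum score."""
    num_archs = len(metrics[0])
    scores = [sum(w * m[i] for w, m in zip(weights, metrics))
              for i in range(num_archs)]
    return sorted(range(num_archs), key=lambda i: scores[i], reverse=True)

def robot_sketch(metrics, evaluate, T, T0, seed=0):
    """Explore weight vectors for T0 rounds, then greedily exploit the best one.

    metrics:  one list of scores per training-free metric (one score per architecture)
    evaluate: callable returning the true performance of an architecture index
    """
    if not 0 < T0 <= T:
        raise ValueError("need 0 < T0 <= T")
    rng = random.Random(seed)
    queried = {}  # architecture index -> true performance (each evaluated once)

    def query(i):
        if i not in queried:
            queried[i] = evaluate(i)
        return queried[i]

    # Exploration. ASSUMPTION: random search replaces Bayesian optimization here.
    best_weights, best_value = None, float("-inf")
    for _ in range(T0):
        weights = [rng.uniform(-1.0, 1.0) for _ in metrics]
        top_arch = rank_architectures(weights, metrics)[0]
        value = query(top_arch)
        if value > best_value:
            best_weights, best_value = weights, value

    # Exploitation: walk down the ranking of the best weights until the
    # budget of T distinct evaluated architectures is used up.
    for i in rank_architectures(best_weights, metrics):
        if len(queried) >= T:
            break
        query(i)

    best_arch = max(queried, key=queried.get)
    return best_arch, queried[best_arch], best_weights, queried

# Hypothetical data: two metrics and a lookup table of true performance.
metrics = [
    [0.9, 0.1, 0.5, 0.3, 0.7, 0.2],
    [0.2, 0.8, 0.6, 0.1, 0.9, 0.4],
]
true_perf = [0.70, 0.55, 0.80, 0.40, 0.95, 0.50]
best_arch, best_value, weights, queried = robot_sketch(
    metrics, lambda i: true_perf[i], T=4, T0=2)
print(best_arch, best_value, len(queried))
```

Counting the budget in distinct evaluated architectures means that exploration rounds which select an already-evaluated architecture leave more of the budget for the greedy phase.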

Paper Structure (58 sections, 6 theorems, 14 equations, 10 figures, 9 tables, 2 algorithms)

Introduction
Related Work
Training-free NAS
Hybrid NAS
Observations and Motivations
Our Methodology
Robustifying Training-free NAS Metric
Exploration to Optimize the Robust Estimation Metric
Exploitation to Bridge the Estimation Gap
Discussion and Theoretical Analyses
Expected Performance of RoBoT
Discussion and Analysis on RoBoT
Influence of $T_0$
Influence of $T$
Experiments
...and 43 more sections

Key Result

Proposition 1

Suppose there are two estimation metrics $\mathcal{M}_1$ and $\mathcal{M}_2$ and an objective evaluation metric $f$. If $\mathrm{Cov}[\mathcal{M}_1,\mathcal{M}_2] \neq \frac{\mathrm{Cov}[\mathcal{M}_2,f]\,\mathrm{Var}[\mathcal{M}_1]}{\mathrm{Cov}[\mathcal{M}_1,f]}$ and … where $\mathrm{Cov}[X, Y]$ and $\rho_{\text{Pearson}}(X, Y)$ are the covariance and Pearson's correlation …
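The quantities named in Proposition 1 can be computed directly. The sketch below is only an illustration: the statement is cut off in this text, so the code checks just the first condition that survives and does not attempt to reproduce the proposition's second condition or its conclusion. The data, the helper names, and the weighted-sum form of the combination are assumptions.

```python
import math

def cov(xs, ys):
    """Population covariance of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def var(xs):
    """Population variance, i.e. Cov[X, X]."""
    return cov(xs, xs)

def pearson(xs, ys):
    """Pearson correlation: Cov[X, Y] / sqrt(Var[X] * Var[Y])."""
    return cov(xs, ys) / math.sqrt(var(xs) * var(ys))

def first_condition_holds(m1, m2, f, tol=1e-12):
    """Check Cov[M1, M2] != Cov[M2, f] * Var[M1] / Cov[M1, f]."""
    c1f = cov(m1, f)
    if abs(c1f) < tol:
        raise ValueError("Cov[M1, f] is zero, so the ratio is undefined")
    return abs(cov(m1, m2) - cov(m2, f) * var(m1) / c1f) > tol

def combine(w1, w2, m1, m2):
    """ASSUMPTION: the combined metric is the weighted sum w1*M1 + w2*M2."""
    return [w1 * a + w2 * b for a, b in zip(m1, m2)]

# Hypothetical scores for five architectures; f is built as M1 + M2 so that
# an equal-weight combination tracks f exactly in this toy case.
m1 = [1.0, 3.0, 2.0, 5.0, 4.0]
m2 = [2.0, 1.0, 4.0, 3.0, 6.0]
f = [a + b for a, b in zip(m1, m2)]

print(first_condition_holds(m1, m2, f))
print(pearson(m1, f), pearson(m2, f), pearson(combine(1.0, 1.0, m1, m2), f))
```

In this constructed example the combined metric correlates with $f$ more strongly than either metric alone, which is the kind of improvement the TL;DR attributes to combining metrics; real metrics will not generally admit an exact combination.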

Figures (10)

Figure 1: (a) Scores of training-free Metric 1 and Metric 2 for four architectures on a specific task; a higher score means the architecture is more strongly recommended. The true architecture performance is ranked as $1 > 2 > 3 > 4$. (b) The highest-scoring architecture selected for each weight vector over Metric 1 and Metric 2. Within each region, every weight vector selects the same architecture, indicated by the region's color.
Figure 2: Comparison of NAS algorithms on NAS-Bench-201 and TransNAS-Bench-101-micro regarding the number of searched architectures. RoBoT and HNAS are reported with the mean and standard error of 10 runs; RS, REA and REINFORCE are reported with those of 50 runs.
Figure 3: Comparison between RoBoT and other NAS baselines in NAS-Bench-201 regarding the number of searched architectures. Note that RoBoT and HNAS are reported with the mean and standard error of 10 independent searches, while RS, REA, and REINFORCE are reported with 50 independent searches.
Figure 4: Similarity and correlation among the varying optimized linear combination weights on 4 tasks of TransNAS-Bench-101-micro.
Figure 5: Comparison between different values of $T_0$ and RoBoT on 5 tasks in TransNAS-Bench-101-micro regarding the number of searched architectures. Note that all methods are reported with the mean and standard error of 10 independent searches.
...and 5 more figures
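The idea in the Figure 1 caption, that different weight vectors select different architectures, can be illustrated with a toy example. The scores below are hypothetical and are not the values plotted in the paper; they are chosen so that each metric alone misses the best architecture while a mixed weight vector selects it. The weighted-sum form and the function names are assumptions.

```python
import math

# Hypothetical (Metric 1, Metric 2) scores for architectures 1..4,
# whose true performance is ranked 1 > 2 > 3 > 4.
SCORES = {1: (0.8, 0.8), 2: (1.0, 0.2), 3: (0.2, 1.0), 4: (0.1, 0.1)}

def select(w1, w2, scores=SCORES):
    """Return the architecture with the highest weighted-sum score."""
    return max(scores, key=lambda a: w1 * scores[a][0] + w2 * scores[a][1])

def sweep(num_angles=8):
    """Sample weight vectors on the unit circle and record each selection."""
    selections = []
    for k in range(num_angles):
        theta = 2 * math.pi * k / num_angles
        w1, w2 = math.cos(theta), math.sin(theta)
        selections.append((round(w1, 2), round(w2, 2), select(w1, w2)))
    return selections

for w1, w2, arch in sweep():
    print(f"w = ({w1:+.2f}, {w2:+.2f}) -> architecture {arch}")
```

With these scores, Metric 1 alone picks architecture 2 and Metric 2 alone picks architecture 3, while equal positive weights pick architecture 1, the best one.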

Theorems & Definitions (8)

Proposition 1
Theorem 1
Theorem 2
Lemma 1
proof
Theorem 3: Corollary 18 in linear-partial-monitoring
Theorem 4
proof