Robust NAS under adversarial training: benchmark, theory, and beyond

Yongtao Wu, Fanghui Liu, Carl-Johann Simon-Gabriel, Grigorios G Chrysos, Volkan Cevher

TL;DR

This work tackles the need for robust neural architecture search by (i) releasing NAS-RobBench-201, a benchmark that evaluates 6466 NAS-Bench-201 architectures under adversarial training across CIFAR-10/100 and ImageNet-16-120, with robust and clean accuracies, and (ii) developing an NTK-based generalization theory for multi-objective NAS that jointly accounts for standard and adversarial objectives. The theory shows that clean accuracy is governed by a combination of the clean NTK and a robust NTK, while robust accuracy depends on a robust NTK and its twice-perturbed variant, and provides a lower bound on the minimum eigenvalue to guarantee generalization. Empirically, the work demonstrates that robust NTK metrics correlate more strongly with robustness than traditional NTK, and that robust benchmarks yield different architecture rankings than standard benchmarks, underscoring the need for adversarially trained NAS evaluations. The dataset and theory together offer reproducible benchmarks and a theoretical foundation to guide the design of robust NAS algorithms, with practical impact on how researchers assess and pursue robustness in architecture search.

Abstract

Recent developments in neural architecture search (NAS) emphasize the significance of considering robust architectures against malicious data. However, there is a notable absence of benchmark evaluations and theoretical guarantees for searching these robust architectures, especially when adversarial training is considered. In this work, we aim to address these two challenges, making twofold contributions. First, we release a comprehensive data set that encompasses both clean accuracy and robust accuracy for a vast array of adversarially trained networks from the NAS-Bench-201 search space on image datasets. Then, leveraging the neural tangent kernel (NTK) tool from deep learning theory, we establish a generalization theory for searching architecture in terms of clean accuracy and robust accuracy under multi-objective adversarial training. We firmly believe that our benchmark and theoretical insights will significantly benefit the NAS community through reliable reproducibility, efficient assessment, and theoretical foundation, particularly in the pursuit of robust architectures.

Paper Structure (28 sections, 17 theorems, 91 equations, 8 figures, 7 tables, 1 algorithm)

This paper contains 28 sections, 17 theorems, 91 equations, 8 figures, 7 tables, 1 algorithm.

Key Result

Theorem 1

Denote the expected clean $0$-$1$ loss as $\mathcal{L}^{\mathrm{clean}}_{0-1}({\bm{W}}) := \mathbb{E}_{({\bm{x}},y)}[\mathbb{1}\{ y f({\bm{x}}, {\bm{W}}) < 0 \}]$ and the expected robust $0$-$1$ loss as $\mathcal{L}^{\mathrm{robust}}_{0-1}({\bm{W}}) := \mathbb{E}_{({\bm{x}},y)}[\mathbb{1}\{\dots\}]$ [...] where $\lambda_{\min}(\cdot)$ indicates the minimum eigenvalue of the NTK matrix, the expectation [...]
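
To make the two losses concrete, the following is a minimal sketch, not taken from the paper, that computes their empirical counterparts for a linear classifier $f({\bm{x}}, {\bm{w}}) = \langle {\bm{w}}, {\bm{x}} \rangle$. The theorem statement above is truncated before the robust loss is fully defined, so the sketch assumes the common choice of a worst case over an $\ell_\infty$ ball of radius $\rho$ around each input. Under that assumption the worst-case margin of a linear model has a closed form: it drops by $\rho \|{\bm{w}}\|_1$. The function names, the weights, and the toy data are all hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def clean_01_loss(w, data):
    """Empirical clean 0-1 loss: fraction of samples with y * f(x, w) < 0."""
    errors = sum(1 for x, y in data if y * dot(w, x) < 0)
    return errors / len(data)


def robust_01_loss(w, data, rho):
    """Empirical robust 0-1 loss for a linear model.

    Assumption (not from the source): the adversary may move x anywhere in an
    l_inf ball of radius rho. For f(x, w) = <w, x> the worst-case margin is
    y * <w, x> - rho * ||w||_1, so no inner optimization loop is needed.
    """
    l1_norm = sum(abs(wi) for wi in w)
    errors = sum(1 for x, y in data if y * dot(w, x) - rho * l1_norm < 0)
    return errors / len(data)


# Hypothetical toy data: (input, label) pairs with labels in {-1, +1}.
w = [1.0, -1.0]
data = [
    ([2.0, 0.0], 1),
    ([0.3, 0.0], 1),
    ([0.0, 1.0], -1),
    ([0.0, -0.5], -1),
]

print("clean 0-1 loss :", clean_01_loss(w, data))
print("robust 0-1 loss:", robust_01_loss(w, data, rho=0.2))
```

With $\rho = 0$ the robust loss reduces to the clean loss, and it can only grow as $\rho$ increases, which matches the intuition that robust accuracy is never higher than clean accuracy for the same model.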

Figures (8)

  • Figure 1: Visualization of the NAS-Bench-201 search space. Top left: A neural cell with 4 nodes and 6 edges. Top right: 5 predefined operations that can be selected as the edge in the cell. Bottom: Macro structure of each candidate architecture in the benchmark.
  • Figure 2: Boxplots for both clean and robust accuracy of all $6466$ non-isomorphic architectures in the considered search space. Red line indicates the accuracy of a random guess.
  • Figure 3: (a) Distribution of accuracy on CIFAR-10. The peak in the distribution of clean accuracy is much more pronounced than the peaks for FGSM and PGD accuracy. (b) The architecture rankings on CIFAR-10 sorted by the robust metric and by the clean metric correlate well for lower-ranked architectures (larger x-axis values) but still differ for higher-ranked ones. Both (a) and (b) motivate NAS for robust architectures in terms of robust accuracy rather than clean accuracy. (c) Architecture ranking by average robust accuracy on 3 datasets, sorted by the average robust accuracy on CIFAR-10. The architectures perform similarly across datasets, which motivates transferable NAS under adversarial training.
  • Figure 4: The operators of each edge in the top-10 architectures (average robust accuracy) on NAS-RobBench-201. The definitions of the edge numbers ($\#1 \sim \#6$) and operators are illustrated in Figure 1.
  • Figure 5: Spearman coefficient between NTK scores and various metrics. Labels with $2\rho$ on the x-axis indicate scores w.r.t. the robust twice NTK, while labels with $\rho$ indicate scores w.r.t. the robust NTK.
  • ...and 3 more figures
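
Figure 5 reports Spearman coefficients between NTK-based scores and accuracy metrics. As a rough illustration of how such a rank correlation is computed, here is a small standard-library sketch. It is not the authors' evaluation code, and the score and accuracy values are made-up placeholders rather than entries from NAS-RobBench-201.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    return pearson(ranks(a), ranks(b))


# Hypothetical per-architecture values, for illustration only.
ntk_scores = [0.1, 0.4, 0.35, 0.8, 0.6]
robust_acc = [30.0, 42.0, 45.0, 51.0, 48.0]

print("Spearman coefficient:", spearman(ntk_scores, robust_acc))
```

Because the coefficient depends only on the ordering of the values, it suits comparing architecture rankings, where the scale of an NTK score and the scale of an accuracy are not directly comparable.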

Theorems & Definitions (33)

  • Definition 1: $\rho$-Bounded adversary
  • Theorem 1: Generalization bound of FCNN by NAS
  • Corollary 1
  • Lemma 1: Corollary 5.35 in vershynin_2012
  • Lemma 2: Upper bound of spectral norms of initial weight
  • proof: Proof of Lemma 2
  • Lemma 3: The order of the network output at initialization
  • proof
  • Lemma 4
  • proof
  • ...and 23 more