SWAP-NAS: Sample-Wise Activation Patterns for Ultra-fast NAS

Yameng Peng, Andy Song, Haytham M. Fayek, Vic Ciesielski, Xiaojun Chang

TL;DR

The paper tackles the high cost of neural architecture search by introducing Sample-Wise Activation Patterns (SWAP) and the SWAP-Score, a training-free metric that correlates strongly with ground-truth performance across diverse NAS spaces and tasks. By evaluating activation patterns on a per-sample basis, SWAP-Score overcomes limitations of standard activation-pattern proxies and enables an ultra-fast NAS method, SWAP-NAS, when combined with an evolutionary search strategy. Regularisation further improves correlation and adds model-size control, particularly in cell-based search spaces, while maintaining high efficiency on CIFAR-10 and ImageNet. The approach achieves state-of-the-art speed and competitive accuracy, suggesting training-free metrics can reliably guide NAS without extensive training.

Abstract

Training-free metrics (a.k.a. zero-cost proxies) are widely used to avoid resource-intensive neural network training, especially in Neural Architecture Search (NAS). Recent studies show that existing training-free metrics have several limitations, such as limited correlation and poor generalisation across different search spaces and tasks. Hence, we propose Sample-Wise Activation Patterns and its derivative, SWAP-Score, a novel high-performance training-free metric. It measures the expressivity of networks over a batch of input samples. The SWAP-Score is strongly correlated with ground-truth performance across various search spaces and tasks, outperforming 15 existing training-free metrics on NAS-Bench-101/201/301 and TransNAS-Bench-101. The SWAP-Score can be further enhanced by regularisation, which leads to even higher correlations in cell-based search space and enables model size control during the search. For example, Spearman's rank correlation coefficient between regularised SWAP-Score and CIFAR-100 validation accuracies on NAS-Bench-201 networks is 0.90, significantly higher than 0.80 from the second-best metric, NWOT. When integrated with an evolutionary algorithm for NAS, our SWAP-NAS achieves competitive performance on CIFAR-10 and ImageNet in approximately 6 minutes and 9 minutes of GPU time respectively.
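The abstract evaluates each metric by Spearman's rank correlation between metric values and ground-truth accuracies. As a minimal sketch of what that statistic computes (not the authors' evaluation code), the pure-Python snippet below ranks both sequences, averaging tied ranks, and takes the Pearson correlation of the ranks. The function names `average_ranks` and `spearman` and the toy numbers are illustrative assumptions, not values from the paper.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical metric scores and accuracies for four networks (made up).
toy_scores = [3.0, 1.0, 4.0, 2.0]
toy_accuracies = [0.70, 0.50, 0.60, 0.55]
rho = spearman(toy_scores, toy_accuracies)
```

A value near 1 means the metric orders networks almost the same way as their measured accuracy, which is the property a search procedure relies on when it uses the metric in place of training.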

Paper Structure (21 sections, 6 equations, 22 figures, 6 tables, 1 algorithm)

Figures (22)

  • Figure 1: Search cost and performance comparison between SWAP-NAS and other SoTA NAS on CIFAR-10. Methods over 1 GPU day are not included. The dot size indicates the model size.
  • Figure 2: Two examples of $\mathbb{A}_{\mathcal{N},\theta}$ with different inputs. Green denotes duplicate patterns.
  • Figure 3: Illustration of $\mathbb{A}_{\mathcal{N},\theta}$ and $\mathbb{\hat{A}}_{\mathcal{N},\theta}$ from a network $\mathcal{N}$. Green denotes the duplicate patterns.
  • Figure 4: Spearman's rank correlation coefficients between TF-metric values and networks' ground-truth performance for 15 existing metrics and our two SWAP-Scores. The rows and columns are sorted based on mean scores of five independent experiments for each metric.
  • Figure 5: Illustration of regularised SWAP-Score's capability on model size control in NAS.
  • ...and 17 more figures

Theorems & Definitions (5)

  • Definition 3.1
  • Definition 3.2: Sample-Wise Activation Patterns
  • Definition 3.3: SWAP-Score $\Psi$
  • Definition 3.4: Regularisation
  • Definition 3.5: Regularised SWAP-Score
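The definition bodies are not reproduced in this listing, so the following is only a hedged sketch of one plausible reading of Definitions 3.2 and 3.3, guided by the figure captions that mention duplicate patterns. It assumes a standard activation pattern is one binary vector per input sample (spanning all units), that a sample-wise pattern is one binary vector per unit (spanning all samples in the batch), and that the score is the number of distinct patterns. The helper names and the toy activation matrix are hypothetical, and regularisation is deliberately left out.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]  # rows = input samples, columns = units


def binarise(activations: Matrix) -> List[List[int]]:
    """Map each value to 1 if positive (unit active), else 0."""
    return [[1 if v > 0 else 0 for v in row] for row in activations]


def count_standard_patterns(activations: Matrix) -> int:
    """Distinct per-sample patterns: one binary row per input sample."""
    return len({tuple(row) for row in binarise(activations)})


def count_sample_wise_patterns(activations: Matrix) -> int:
    """Distinct per-unit patterns: one binary column per unit, over all samples.

    Assumption: this cardinality plays the role of the (unregularised) score.
    """
    bits = binarise(activations)
    return len({tuple(column) for column in zip(*bits)})


# Hypothetical values for 3 samples and 5 units (not from the paper).
toy_activations = [
    [0.5, -1.0, 2.0, 0.5, -0.3],
    [0.1, 0.2, -0.4, 0.1, -0.7],
    [-0.2, 0.3, 1.0, -0.2, -0.9],
]
standard_count = count_standard_patterns(toy_activations)
sample_wise_count = count_sample_wise_patterns(toy_activations)
```

Under this reading, the per-sample count can never exceed the number of input samples, whereas the sample-wise count is bounded by the number of units, which is typically far larger. In a real network the matrix would be gathered from the post-activation outputs of every layer for one batch.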