Automated Search-Space Generation Neural Architecture Search

Tianyi Chen; Luming Liang; Tianyu Ding; Ilya Zharkov

TL;DR

This work tackles the bottleneck of hand-crafted NAS search spaces by introducing ASGNAS, an automated system that generates a search space for a general DNN, trains it once, and constructs a compact, high-performing sub-network. Central to ASGNAS is the Hierarchical Half-Space Projected Gradient (H2SPG), a novel optimizer that handles hierarchical structured sparsity so that redundant removal structures can be pruned while the network stays valid. The approach combines graph-based automated search-space generation, hierarchical sparsity optimization, and automated sub-network construction, and is demonstrated across multiple architectures (e.g., RegNet, StackedUnets, DARTS, SuperResNet) and datasets (CIFAR10, Fashion-MNIST, ImageNet, STL-10, SVHN). Results show sub-networks that match or exceed full-network performance with substantially fewer parameters and FLOPs, underscoring the practical potential of scalable, automated NAS for diverse DNNs. The accompanying library and methodology pave the way for broader adoption of generally applicable automated NAS, despite current limitations such as ONNX dependence.

Abstract

To search for an optimal sub-network within a general deep neural network (DNN), existing neural architecture search (NAS) methods typically rely on handcrafting a search space beforehand. Such requirements make it challenging to extend them to general scenarios without significant human expertise and manual intervention. To overcome these limitations, we propose Automated Search-Space Generation Neural Architecture Search (ASGNAS), perhaps the first automated system to train general DNNs that cover all candidate connections and operations and to produce high-performing sub-networks in a one-shot manner. Technically, ASGNAS delivers three notable contributions that minimize human effort: (i) automated search space generation for general DNNs; (ii) a Hierarchical Half-Space Projected Gradient (H2SPG) that leverages the hierarchy and dependency within the generated search space to ensure network validity during optimization, and reliably produces a solution with both high performance and hierarchical group sparsity; and (iii) automated sub-network construction upon the H2SPG solution. Numerically, we demonstrate the effectiveness of ASGNAS on a variety of general DNNs, including RegNet, StackedUnets, SuperResNet, and DARTS, over benchmark datasets such as CIFAR10, Fashion-MNIST, ImageNet, STL-10, and SVHN. The sub-networks computed by ASGNAS achieve competitive or even superior performance compared to the starting full DNNs and other state-of-the-art methods. The library will be released at https://github.com/tianyic/only_train_once.
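The abstract's point that hierarchy and dependency must be respected "to ensure the network validity" can be illustrated with a toy sketch. This is not H2SPG itself: it is a simplified greedy selection over a hypothetical segment graph, and the function names, the graph, and the salience values are all assumptions made for illustration. It marks the lowest-salience vertices as redundant candidates, but skips any candidate whose removal would disconnect the input from the output.

```python
from collections import deque


def is_connected(edges, removed, source, sink):
    """Return True if `sink` is reachable from `source` once `removed` vertices are dropped."""
    if source in removed or sink in removed:
        return False
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            return True
        for nxt in edges.get(node, ()):
            if nxt not in removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def pick_redundant(edges, salience, source, sink, k):
    """Greedily mark up to k lowest-salience vertices as redundant.

    A vertex is skipped if removing it (together with the vertices already
    marked) would leave no path from source to sink. This is a toy stand-in
    for a validity check, not the paper's procedure.
    """
    removed = set()
    for node in sorted(salience, key=salience.get):
        if len(removed) == k:
            break
        if is_connected(edges, removed | {node}, source, sink):
            removed.add(node)
    return removed


# Hypothetical graph: three parallel branches between input and output.
edges = {"in": ["a", "b", "c"], "a": ["out"], "b": ["out"], "c": ["out"]}
salience = {"a": 0.1, "b": 0.2, "c": 0.9}
redundant = pick_redundant(edges, salience, "in", "out", k=3)
print(sorted(redundant))
```

In this toy graph the target of three redundant vertices cannot be met: once two branches are marked, removing the last one would disconnect the network, so it is kept regardless of its score.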

Paper Structure (27 sections, 1 equation, 13 figures, 4 tables, 3 algorithms)

Introduction
Related Work
Neural Architecture Search (NAS).
Hierarchical Structured Sparsity Optimization.
ASGNAS
Automated Search Space Generation
Hierarchical Half-Space Projected Gradient (H2SPG)
Outline of H2SPG.
Automated Sub-Network Construction.
Numerical Experiments
DemoNet on Fashion-MNIST.
StackedUnets on SVHN.
DARTS (8-Cells) on STL-10.
SuperResNet on CIFAR10.
DARTS (14-Cells) on ImageNet.
...and 12 more sections

Figures (13)

Figure 1: Overview of ASGNAS. Given a general DNN, ASGNAS first automatically generates a search space, then employs H2SPG to identify redundant removal structures and train the important counterparts to high performance, and finally constructs a compact and high-performing sub-network.
Figure 2: Automated Search Space Generation. (a) The DemoNet to be trained and searched; (b) the constructed segment graph; and (c) the trainable variable partition, where $\mathcal{G}_s$ represents the groups corresponding to removal structures. $\bm{\widehat{\mathcal{K}}}_i$ and $\bm{b}_i$ are the flattened filter matrix and bias vector for Conv-i, respectively. $\bm{\gamma}_i$ and $\bm{\beta}_i$ are the weight and bias vectors for BN-i. $\bm{\mathcal{W}}_i$ is the weight matrix for Linear-i. The columns of $\bm{\widehat{\mathcal{K}}}_6$ are marked in accordance with its incoming segments.
Figure 3: Checking the validity of redundant candidates. Target group sparsity $K=3$. Conv7-BN7 has a smaller salience score than Conv2-BN2. Dotted vertices are marked as redundant candidates.
Figure 4: Identification of redundant removal structures and sub-network construction.
Figure 5: StackedUnets illustrations drawn by ASGNAS.
...and 8 more figures
