Automated Dominative Subspace Mining for Efficient Neural Architecture Search

Yaofo Chen, Yong Guo, Daihai Liao, Fanbing Lv, Hengjie Song, James Tin-Yau Kwok, Mingkui Tan

TL;DR

DSM-NAS, a novel Neural Architecture Search method based on Dominative Subspace Mining, finds promising architectures in automatically mined subspaces. It not only reduces the search cost but also discovers better architectures than state-of-the-art methods in various benchmark search spaces.

Abstract

Neural Architecture Search (NAS) aims to automatically find effective architectures within a predefined search space. However, the search space is often extremely large, so searching it directly is both difficult and very time-consuming. To address these issues, in each search step we limit the search space to a small but effective subspace, which improves both search performance and search efficiency. To this end, we propose a novel Neural Architecture Search method via Dominative Subspace Mining (DSM-NAS) that finds promising architectures in automatically mined subspaces. Specifically, we first perform a global search, i.e., dominative subspace mining, to find a good subspace from a set of candidates. Then, we perform a local search within the mined subspace to find effective architectures. More critically, we further boost search performance by using well-designed/searched architectures to initialize the candidate subspaces. Experimental results demonstrate that DSM-NAS not only reduces the search cost but also discovers better architectures than state-of-the-art methods in various benchmark search spaces.
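
The two-step procedure described above (pick a subspace globally, search locally inside it, keep the result only if it improves) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the operation set `OPS`, the reward `toy_reward`, and the function names are hypothetical, and the global step uses uniform random selection as a stand-in for the learned controller described in the paper.

```python
import random

OPS = "357"  # hypothetical operation choices, one character per layer


def toy_reward(arch):
    # Assumption: a stand-in for validation accuracy (counts '5' operations).
    return arch.count("5")


def local_search(center, rng):
    # Propose a nearby architecture by changing one position of the center.
    i = rng.randrange(len(center))
    new_op = rng.choice([o for o in OPS if o != center[i]])
    return center[:i] + new_op + center[i + 1:]


def two_step_search(candidates, steps, seed=0):
    rng = random.Random(seed)
    candidates = list(candidates)  # copy so the caller's list is untouched
    for _ in range(steps):
        # Global step: choose which subspace (identified by its center) to
        # search. Uniform random choice replaces the learned controller here.
        k = rng.randrange(len(candidates))
        # Local step: propose a modified architecture inside that subspace.
        beta = local_search(candidates[k], rng)
        # Update the center only if the new architecture is better.
        if toy_reward(beta) > toy_reward(candidates[k]):
            candidates[k] = beta
    return candidates
```

Because a center is replaced only when the proposal scores higher, the reward of each candidate never decreases over the course of the search in this sketch.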

Paper Structure

This paper contains 26 sections, 4 equations, 9 figures, 4 tables, and 1 algorithm.

Figures (9)

  • Figure 1: An illustration of the search process. We find promising architectures in a two-step search manner: 1) we perform global search to mine/find a dominative subspace from a set of candidates; 2) we move the focus to the subspace and conduct a local search for effective architectures within it. Then, we update the candidate subspace with the better searched architecture.
  • Figure 2: An overview of the proposed DSM-NAS. We build a set of subspaces $\{\Omega_{\alpha_i}\}_{i=1}^K$ centered on randomly sampled candidate architectures $\{\alpha_i\}_{i=1}^K$ and construct a subspace graph ${\mathcal{G}}$ to model the relationships among these subspaces. By taking ${\mathcal{G}}$ as the input, the controller first mines/finds a dominative subspace $\Omega_\alpha \sim \pi_{G}(\cdot | {\mathcal{G}}; \theta_{G})$ via global search and then predicts an architecture modification $\Delta \alpha \sim \pi_{L}(\cdot | \Omega_{\alpha}; \theta_{L})$ via local search. Next, we update the candidate architecture $\alpha$ with the resultant architecture $\beta=\alpha \oplus \Delta \alpha$ if $\beta$ has better performance than $\alpha$ (i.e., $R(\beta | \alpha) > 0$).
  • Figure 3: An illustration of the architecture representation method and the calculation of the architecture distance. We represent an architecture as a string, in which each item denotes an operation (e.g., convolution). For example, '3', '5' and '7' denote $3\times3$, $5\times5$ and $7\times7$ convolution, respectively.
  • Figure 4: The architecture searched by DSM-NAS in the MobileNet-like search space.
  • Figure 5: The architecture searched by DSM-NAS+ in the MobileNet-like search space.
  • ...and 4 more figures
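
The string representation from the Figure 3 caption, and the idea of a subspace centered on a candidate architecture from the Figure 2 caption, can be sketched as follows. The captions do not state how the architecture distance is computed, so this sketch assumes a Hamming distance (the number of positions where two strings differ). It also assumes a subspace is the set of architectures within a fixed distance of its center. The names `arch_distance`, `subspace`, and `radius` are hypothetical.

```python
from itertools import combinations, product

OPS = "357"  # '3', '5', '7' stand for 3x3, 5x5 and 7x7 convolution


def arch_distance(a, b):
    # Assumed metric: Hamming distance between equal-length strings.
    if len(a) != len(b):
        raise ValueError("architectures must have the same length")
    return sum(x != y for x, y in zip(a, b))


def subspace(center, radius=1):
    # Enumerate every architecture within `radius` of `center`
    # (an assumed definition of a subspace centered on an architecture).
    found = {center}
    for r in range(1, radius + 1):
        for positions in combinations(range(len(center)), r):
            # For each chosen position, list the operations that differ.
            choices = [[o for o in OPS if o != center[p]] for p in positions]
            for replacement in product(*choices):
                arch = list(center)
                for p, o in zip(positions, replacement):
                    arch[p] = o
                found.add("".join(arch))
    return found
```

For example, `arch_distance("3573", "3753")` is 2 because the strings differ in the second and third positions, and `subspace("357", radius=1)` contains the center plus the six architectures that change exactly one operation.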