Analyzing the Expected Hitting Time of Evolutionary Computation-based Neural Architecture Search Algorithms

Zeqiong Lv, Chao Qian, Gary G. Yen, Yanan Sun

TL;DR

This work is the first attempt to establish a theoretical foundation for ENAS algorithms. It integrates theory and experiment to estimate their expected hitting time (EHT) through five steps: common configuration, search space partition, transition probability estimation, population distribution fitting, and hitting time analysis.

Abstract

Evolutionary computation-based neural architecture search (ENAS) is a popular technique for automating the architecture design of deep neural networks. Despite its groundbreaking applications, no theoretical study of ENAS exists. The expected hitting time (EHT) is one of the most important theoretical issues, because it implies the average computational time complexity. This paper proposes a general method that integrates theory and experiment to estimate the EHT of ENAS algorithms. The method consists of common configuration, search space partition, transition probability estimation, population distribution fitting, and hitting time analysis. Using the proposed method, we consider the ($\lambda$+$\lambda$)-ENAS algorithms with different mutation operators and estimate lower bounds of the EHT. Furthermore, we study the EHT on the NAS-Bench-101 problem, and the results demonstrate the validity of the proposed method. To the best of our knowledge, this work is the first attempt to establish a theoretical foundation for ENAS algorithms.
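
To make the notion of hitting time concrete, the following is a minimal sketch of how it could be measured empirically for a ($\lambda$+$\lambda$) evolutionary algorithm. It is not the paper's method or search space. It assumes a toy bit-string encoding with a OneMax-style fitness (number of ones), bit-wise mutation with rate 1/n, and one offspring per parent; the names `hitting_time` and `estimate_eht` are hypothetical. The hitting time is the first generation at which the population contains the optimum, and the EHT is approximated by averaging over independent runs.

```python
import random


def hitting_time(n, lam, rng, max_gens=100_000):
    """Generations until a (lam+lam) EA first holds the all-ones string.

    Toy assumption: individuals are bit lists of length n and fitness is
    the number of ones, so the optimum has fitness n.
    """
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(lam)]
    for gen in range(max_gens + 1):
        if max(sum(ind) for ind in pop) == n:
            return gen
        # Assumed mutation: each parent yields one offspring, and every
        # bit is flipped independently with probability 1/n.
        offspring = [[b ^ (rng.random() < 1.0 / n) for b in ind] for ind in pop]
        # Plus-selection: keep the best lam of parents and offspring.
        pop = sorted(pop + offspring, key=sum, reverse=True)[:lam]
    raise RuntimeError("optimum not reached within max_gens")


def estimate_eht(n, lam, runs, seed=0):
    """Average hitting time over independent runs (an empirical EHT estimate)."""
    rng = random.Random(seed)
    return sum(hitting_time(n, lam, rng) for _ in range(runs)) / runs


if __name__ == "__main__":
    for lam in (1, 5, 20):
        print(lam, estimate_eht(n=12, lam=lam, runs=30))
```

Averaging over runs gives only an experimental estimate. The theoretical lower bounds discussed in the paper come from analysis of the underlying Markov chain, which this sketch does not attempt.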

Paper Structure

This paper contains 16 sections, 9 theorems, 27 equations, 4 figures, 1 table, and 1 algorithm.

Key Result

Lemma 1

Given a Markov chain $\left\lbrace {\xi_t}\right\rbrace _{t=0}^{+\infty}$ that converges to $\chi^*$, where the initial state $\xi_0$ satisfies $P(\xi_0\in \chi-\chi^*)>0$, and an initial distance $d(\xi_0)$, for each generation $t$ there is an average drift $\bar{\Delta}_t$. If $0<c_2 \leq \bar{\Delta}_t \l

Figures (4)

  • Figure 1: Three architectures for three kinds of search spaces, i.e., (a) an architecture in layer-based search space, (b) an architecture in block-based search space, and (c) an architecture in cell-based search space.
  • Figure 2: The architecture and encoding schematic of CNN.
  • Figure 3: Visualization of the sampled data (red dots) and fitted surfaces (colored surfaces) of $\pi_t$ for different ENAS algorithms, where the HM distance represents the Hamming distance of the population. (a) The fitting result (corresponding to $Z_1$) of $\pi_t$ for the ENAS algorithm with Mutation#1. (b) The fitting result (corresponding to $Z_2$) of $\pi_t$ for the ENAS algorithm with Mutation#2. (c) The fitting result (corresponding to $Z_3$) of $\pi_t$ for the ENAS algorithm with Mutation#3 ($q=2$). (d) The fitting result (corresponding to $Z_4$) of $\pi_t$ for the ENAS algorithm with Mutation#4.
  • Figure 4: The theoretical and experimental running times (iterations) of the ($\lambda$+$\lambda$)-ENAS algorithms using mutation operators Mutation#1, Mutation#2, Mutation#3, and Mutation#4, respectively, where the parameters of the search space are set to $n=26$ and $L=2$. (a) The lower bounds of the EHT of the ($\lambda$+$\lambda$)-ENAS algorithms with population sizes $\lambda$ ranging from 1 to 100. (b) The average iteration counts of the ($\lambda$+$\lambda$)-ENAS algorithms with population sizes $\lambda$ ranging from 1 to 100 with a step size of four.

Theorems & Definitions (9)

  • Lemma 1: he2016average
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4