Efficient NAS with FaDE on Hierarchical Spaces
Simon Neumeyer, Julian Stier, Michael Granitzer
TL;DR
This work tackles neural architecture search in hierarchical, open-ended spaces by introducing FaDE, a fast DARTS-based estimator that derives FaDE-ranks: relative performance indicators for finite regions of a hyper-architecture. These ranks enable a memory-less, batch-wise outer search driven by a pseudo-gradient. The approach scales linearly with depth and so avoids proxy architectures. Empirical results show strong rank correlation (~0.8) between FaDE-ranks and actual performance on CIFAR-10, and demonstrate that FaDE-guided outer searches can improve architectures over iterations compared to random search and Bayesian optimization. The method offers a generalizable framework for open-ended NAS, with potential extensions to richer graph spaces and alternative outer-search strategies.
Abstract
Neural architecture search (NAS) is a challenging problem. Hierarchical search spaces allow cheap evaluations of neural network submodules to serve as surrogates for full architecture evaluations. Yet sometimes the hierarchy is too restrictive or the surrogate fails to generalize. We present FaDE, which uses differentiable architecture search to obtain relative performance predictions on finite regions of a hierarchical NAS space. The relative nature of these ranks calls for a memory-less, batch-wise outer search algorithm, for which we use an evolutionary algorithm with pseudo-gradient descent. FaDE is especially suited to deep hierarchical (that is, multi-cell) search spaces, which it can explore at linear instead of exponential cost, and it therefore eliminates the need for a proxy search space. Our experiments show, first, that FaDE-ranks on finite regions of the search space correlate with the corresponding architecture performances and, second, that the ranks can empower a pseudo-gradient evolutionary search on the complete neural architecture search space.
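
The claim that FaDE-ranks correlate with architecture performance can be illustrated with a rank correlation computed over one finite region of the search space. The following is a minimal sketch only: it assumes Spearman's coefficient as one common choice (this section does not state which coefficient is used), and the lists `estimated_ranks` and `test_accuracies` are hypothetical values invented for illustration, not results from the experiments.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over all entries tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data (assumption): one relative score per candidate
# architecture in a region, and the accuracy each reaches when trained alone.
estimated_ranks = [0.10, 0.35, 0.20, 0.80, 0.55]
test_accuracies = [0.71, 0.78, 0.80, 0.90, 0.84]

rho = spearman(estimated_ranks, test_accuracies)
print(f"Spearman rho = {rho:.2f}")
```

Only the ordering of the scores matters here, which matches the relative nature of the ranks: rescaling all scores within a region leaves the coefficient unchanged.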
