Efficient Multi-objective Neural Architecture Search via Lamarckian Evolution
Thomas Elsken, Jan Hendrik Metzen, Frank Hutter
TL;DR
The paper introduces LEMONADE, an evolutionary NAS framework that uses Lamarckian inheritance via network morphisms to warm-start offspring, enabling efficient exploration of large architecture spaces under multiple objectives. It distinguishes objectives that are cheap to evaluate (number of parameters, FLOPs) from expensive ones (validation accuracy) and uses two-stage sampling guided by kernel density estimation (KDE) to concentrate expensive evaluations on promising regions of the Pareto front. The method supports arbitrary search spaces, including full architectures and repeatable cells, and demonstrates competitive results on CIFAR-10 as well as transferable performance on ImageNet64x64 and mobile ImageNet with substantially less compute than prior NAS methods. By returning a Pareto set rather than a single optimum, the work advances practical automated model discovery under resource constraints.
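To make the Pareto-set and KDE-guided sampling ideas concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the hand-written Gaussian KDE, the fixed bandwidth, and the choice to weight candidates by inverse density are all illustrative assumptions.

```python
import numpy as np

def pareto_front(points):
    """Return indices of non-dominated rows; every objective is minimized."""
    points = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(points):
        # p is dominated if another point is no worse in every objective
        # and strictly better in at least one.
        no_worse = np.all(points <= p, axis=1)
        strictly_better = np.any(points < p, axis=1)
        if not np.any(no_worse & strictly_better):
            keep.append(i)
    return keep

def gaussian_kde_1d(samples, bandwidth):
    """Build a 1-D Gaussian kernel density estimate (hypothetical helper)."""
    samples = np.asarray(samples, dtype=float)

    def density(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        z = (x[:, None] - samples[None, :]) / bandwidth
        norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)
        return np.exp(-0.5 * z ** 2).sum(axis=1) / norm

    return density

def inverse_density_probs(values, density):
    """Sampling probabilities that favor sparsely populated regions
    of a cheap objective (an assumed weighting scheme)."""
    weights = 1.0 / density(values)
    return weights / weights.sum()
```

For example, with rows of the form (validation error, parameters), `pareto_front([[1, 5], [2, 3], [3, 4], [4, 1]])` drops the third row because the second row is better in both objectives. Applying `inverse_density_probs` to a cheap objective such as log parameter count gives the highest probability to candidates in regions where few architectures have been evaluated so far.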
Abstract
Neural Architecture Search aims at automatically finding neural architectures that are competitive with architectures designed by human experts. While recent approaches have achieved state-of-the-art predictive performance for image recognition, they are problematic under resource constraints for two reasons: (1) the neural architectures found are solely optimized for high predictive performance, without penalizing excessive resource consumption, and (2) most architecture search methods require vast computational resources. We address the first shortcoming by proposing LEMONADE, an evolutionary algorithm for multi-objective architecture search that allows approximating the entire Pareto front of architectures under multiple objectives, such as predictive performance and number of parameters, in a single run of the method. We address the second shortcoming by proposing a Lamarckian inheritance mechanism for LEMONADE, which generates child networks that are warm-started with the predictive performance of their trained parents. This is accomplished by using (approximate) network morphism operators for generating children. The combination of these two contributions allows finding models that are on par with or even outperform both hand-crafted and automatically designed networks.
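The warm-starting idea can be illustrated with a small function-preserving morphism. The sketch below is an assumption-laden toy, not the paper's operator set: it uses a plain ReLU multilayer perceptron, and `forward` and `deepen` are hypothetical names. It covers only an exact morphism (inserting an identity-initialized layer), not the approximate variants mentioned above.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(layers, x):
    """Run an MLP given as a list of (W, b); ReLU follows every layer but the last."""
    h = x
    for W, b in layers[:-1]:
        h = relu(h @ W + b)
    W, b = layers[-1]
    return h @ W + b

def deepen(layers, position):
    """Return a child with an identity layer inserted after hidden layer `position`.

    The child copies all parent weights (Lamarckian inheritance). Because the
    input to the new layer is a ReLU output and therefore non-negative,
    relu(h @ I + 0) == h, so the child computes the same function as the parent.
    """
    if not 0 <= position < len(layers) - 1:
        raise ValueError("position must index a hidden layer")
    width = layers[position][0].shape[1]
    child = [(W.copy(), b.copy()) for W, b in layers]
    child.insert(position + 1, (np.eye(width), np.zeros(width)))
    return child
```

A child built this way starts with exactly the parent's predictions, so subsequent training only needs to refine the new layer rather than learn from scratch.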
