Dense Optimizer: An Information Entropy-Guided Structural Search Method for Dense-like Neural Network Design
Liu Tianyuan, Hou Libin, Wang Linyuan, Song Xiyu, Yan Bin
TL;DR
The paper introduces Dense Optimizer, an information-entropy-guided method for automatically designing dense-like neural networks. It maximizes the network's multi-scale entropy while constraining the distribution of that entropy across stages to follow a power law. Architecture search is decoupled from weight training, and the resulting optimization problem is solved with a branch-and-bound algorithm, so the search runs in about 4 hours on one CPU. On CIFAR-100, the searched model DenseNet-OPT reaches a top-1 accuracy of 84.3%, which is 5.97% higher than the original DenseNet, and results on CIFAR-100 and other datasets show it can outperform manually designed DenseNets and several NAS baselines. The approach offers a principled, scalable framework for dense architecture design and suggests broader applicability to dense-like networks beyond DenseNet.
Abstract
Owing to its lightweight and efficient structure, the Dense Convolutional Network has been continuously refined into increasingly efficient and compact architectures. However, current dense-like architectures are mainly designed manually, and it is becoming increasingly difficult to adjust the channels and reuse level based on past experience. We therefore propose an architecture search method called Dense Optimizer that can search for high-performance dense-like networks automatically. In Dense Optimizer, we view the dense network as a hierarchical information system and maximize the network's information entropy while constraining the distribution of the entropy across stages via a power law, thereby constructing an optimization problem. We also propose a branch-and-bound optimization algorithm that tightly integrates the power-law principle with search space scaling to solve the optimization problem efficiently. The superiority of Dense Optimizer has been validated on different computer vision benchmark datasets. Specifically, Dense Optimizer completes a high-quality search in only 4 hours on one CPU. Our searched model, DenseNet-OPT, achieves a top-1 accuracy of 84.3% on CIFAR-100, which is 5.97% higher than the original one.
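To make the branch-and-bound idea concrete, the following is a minimal sketch of a generic branch-and-bound search over per-stage channel widths. It is not the authors' algorithm: the abstract does not give the entropy formula or the form of the power-law constraint, so `stage_score` (a logarithmic stand-in for stage entropy), `stage_cost` (a quadratic stand-in for parameter cost), and the `budget` constraint are all hypothetical, and the power-law constraint is omitted entirely.

```python
import math


def stage_score(width):
    # Hypothetical stand-in for a stage's entropy (not the paper's formula).
    return math.log(width)


def stage_cost(width):
    # Hypothetical stand-in for a stage's parameter cost.
    return width * width


def branch_and_bound(candidates, num_stages, budget):
    """Pick one width per stage to maximize the total score within the budget."""
    best_score, best_widths = float("-inf"), None
    max_gain = max(stage_score(w) for w in candidates)
    min_cost = min(stage_cost(w) for w in candidates)

    def search(widths, score, cost):
        nonlocal best_score, best_widths
        remaining = num_stages - len(widths)
        if remaining == 0:
            if score > best_score:
                best_score, best_widths = score, tuple(widths)
            return
        # Bound: even the most optimistic completion cannot beat the incumbent.
        if score + remaining * max_gain <= best_score:
            return
        # Branch: try wider stages first so a good incumbent is found early.
        for w in sorted(candidates, reverse=True):
            new_cost = cost + stage_cost(w)
            # Skip branches where the cheapest completion already exceeds the budget.
            if new_cost + (remaining - 1) * min_cost > budget:
                continue
            widths.append(w)
            search(widths, score + stage_score(w), new_cost)
            widths.pop()

    search([], 0.0, 0)
    return best_widths, best_score


if __name__ == "__main__":
    widths, score = branch_and_bound([12, 24, 32, 48], num_stages=3, budget=3000)
    print(widths, score)
```

The optimistic bound ignores the budget, so it never underestimates what a partial assignment can still achieve; pruning on it therefore cannot discard the optimum of this toy problem. A search of this kind evaluates only closed-form scores and trains no weights, which is consistent with the CPU-only search described in the abstract.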
