Dense Optimizer: An Information Entropy-Guided Structural Search Method for Dense-like Neural Network Design

Liu Tianyuan, Hou Libin, Wang Linyuan, Song Xiyu, Yan Bin

TL;DR

The paper introduces Dense Optimizer, an information-entropy-guided method for automatically designing dense-like neural networks by maximizing multi-scale entropy under a power-law distribution across stages. It decouples architecture search from weight training and solves the resulting optimization problem with a branch-and-bound algorithm, completing a search in about 4 hours on a single CPU. On CIFAR-100, the searched model DenseNet-OPT reaches 84.3% top-1 accuracy, 5.97% higher than the original DenseNet, and results on CIFAR-100 and other datasets show it can outperform manually designed DenseNets and several NAS baselines. The approach offers a principled, scalable framework for dense architecture design and suggests broader applicability to dense-like networks beyond DenseNet.

Abstract

The Dense Convolutional Network (DenseNet) has been continuously refined into highly efficient and compact architectures, owing to its lightweight structure. However, current dense-like architectures are mainly designed manually, and it becomes increasingly difficult to adjust the channels and reuse level based on past experience. We therefore propose an architecture search method called Dense Optimizer that searches for high-performance dense-like networks automatically. In Dense Optimizer, we view the dense network as a hierarchical information system and maximize the network's information entropy while constraining the distribution of entropy across stages via a power law, thereby constructing an optimization problem. We also propose a branch-and-bound optimization algorithm that tightly integrates the power-law principle with search-space scaling to solve the optimization problem efficiently. The superiority of Dense Optimizer has been validated on several computer vision benchmark datasets. Specifically, Dense Optimizer completes a high-quality search in only 4 hours on one CPU. Our searched model, DenseNet-OPT, achieves a top-1 accuracy of 84.3% on CIFAR-100, which is 5.97% higher than the original DenseNet.

Paper Structure

This paper contains 16 sections, 10 equations, 3 figures, 6 tables, 1 algorithm.

Figures (3)

  • Figure 1: Visualization of the multi-scale entropy power-law distribution, based on statistics of dense backbones. The distribution of a dense backbone's entropy across feature sizes is consistent with a power-law function.
  • Figure 2: Power-fit hyperparameters vs. top-1 accuracy of each optimized model on CIFAR-100. From left to right: correlation of the power-fit hyperparameters a and b with accuracy on CIFAR-100. Both hyperparameters correlate strongly with accuracy, and the correlations remain consistent across different growth-rate settings.
  • Figure 3: From left to right: fitting error, sum of squared residuals, coefficient of determination, and adjusted coefficient of determination for fits of the high-performance dense network's multi-scale information entropy using first-order and second-order polynomials, linear functions, power functions, and exponential functions. The power-law function achieves the best value on every metric.
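
The fit-quality metrics named in the Figure 3 caption can be illustrated with a minimal sketch. It assumes a power law of the form y = a * x^b (the symbols a and b follow Figure 2; the exact functional form used in the paper is an assumption here), fits it by least squares in log-log space, which may differ from the authors' fitting procedure, and then computes the sum of squared residuals, R^2, and adjusted R^2 in the original space. The function names and the data are hypothetical and synthetic, not values from the paper.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares on (log x, log y).

    Assumes all xs and ys are strictly positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(xs)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx                  # slope in log-log space is the exponent
    a = math.exp(my - b * mx)      # intercept in log-log space is log(a)
    return a, b


def fit_metrics(xs, ys, a, b, n_predictors=1):
    """Return (SSR, R^2, adjusted R^2) of y = a * x**b in the original space."""
    n = len(ys)
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more data points than n_predictors + 1")
    preds = [a * x ** b for x in xs]
    mean_y = sum(ys) / n
    ssr = sum((y - p) ** 2 for y, p in zip(ys, preds))   # sum of squared residuals
    sst = sum((y - mean_y) ** 2 for y in ys)             # total sum of squares
    r2 = 1.0 - ssr / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return ssr, r2, adj_r2


# Synthetic example: hypothetical feature-map sizes and entropies generated
# from an exact power law, NOT measurements from the paper.
sizes = [32, 16, 8, 4]
entropies = [2.0 * s ** -0.5 for s in sizes]

a, b = fit_power_law(sizes, entropies)
ssr, r2, adj_r2 = fit_metrics(sizes, entropies, a, b)
print(f"a={a:.4f}, b={b:.4f}, SSR={ssr:.3e}, R2={r2:.6f}, adjusted R2={adj_r2:.6f}")
```

Running the same metrics for competing models (linear, polynomial, exponential) on identical data gives the kind of side-by-side comparison the caption describes. Adjusted R^2 penalizes models with more free parameters, which matters when a second-order polynomial is compared against a two-parameter power law.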