AutoGrow: Automatic Layer Growing in Deep Convolutional Networks

Wei Wen, Feng Yan, Yiran Chen, Hai Li

TL;DR

The paper tackles the challenge of designing the depth of deep convolutional networks, a process traditionally guided by heuristics and costly trial-and-error. It introduces AutoGrow, a depth-growing algorithm that starts from a shallow seed network and adds sub-modules as long as validation accuracy improves, using robust growing and stopping policies and various initializers. The approach is demonstrated on MNIST, CIFAR, SVHN, and ImageNet with ResNet and PlainNet backbones, showing near-optimal depth and favorable accuracy-computation trade-offs, including on ImageNet, where it outperforms manually designed depths. The paper also shows that random initialization with periodic growth can outperform Network Morphism-based methods, and that depth discovery takes time comparable to training a single DNN, substantially reducing manual architecture search effort.

Abstract

Depth is a key component of Deep Neural Networks (DNNs); however, designing depth is heuristic and requires much human effort. We propose AutoGrow to automate depth discovery in DNNs: starting from a shallow seed architecture, AutoGrow grows new layers if the growth improves accuracy; otherwise, it stops growing and thus discovers the depth. We propose robust growing and stopping policies that generalize to different network architectures and datasets. Our experiments show that, by applying the same policy to different network architectures, AutoGrow can always discover near-optimal depth on the MNIST, FashionMNIST, SVHN, CIFAR10, CIFAR100 and ImageNet datasets. For example, in terms of accuracy-computation trade-off, AutoGrow discovers a better depth combination in ResNets than human experts. AutoGrow is efficient: it discovers depth in time similar to that of training a single DNN. Our code is available at https://github.com/wenwei202/autogrow.
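The grow-then-stop idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a single stack of layers, a hypothetical `train_and_validate` callback that returns validation accuracy for a given depth, and a toy accuracy curve standing in for real training. The names `grow_depth`, `toy_accuracy`, `max_depth`, and `min_gain` are illustrative and do not come from the paper.

```python
from typing import Callable, List, Tuple


def grow_depth(
    seed_depth: int,
    train_and_validate: Callable[[int], float],
    max_depth: int = 64,
    min_gain: float = 0.0,
) -> Tuple[int, List[float]]:
    """Grow one layer at a time while validation accuracy keeps improving.

    Simplified illustration only: `train_and_validate` is a hypothetical
    callback that trains a network of the given depth and returns its
    validation accuracy.
    """
    depth = seed_depth
    best_acc = train_and_validate(depth)
    history = [best_acc]  # accuracy of every depth that was tried

    while depth < max_depth:
        candidate_acc = train_and_validate(depth + 1)
        history.append(candidate_acc)
        if candidate_acc - best_acc <= min_gain:
            break  # growth did not help: stop and keep the current depth
        depth += 1
        best_acc = candidate_acc

    return depth, history


def toy_accuracy(depth: int) -> float:
    """Assumed stand-in for real training: accuracy peaks at depth 5."""
    return 0.9 - 0.01 * (depth - 5) ** 2


if __name__ == "__main__":
    final_depth, accs = grow_depth(seed_depth=1, train_and_validate=toy_accuracy)
    print(final_depth, accs)
```

With the toy curve, growth proceeds from the seed depth until adding a sixth layer lowers accuracy, at which point the loop stops and reports the depth it had reached. A real run would replace `toy_accuracy` with actual training and validation, and the growing and stopping policies described in the paper are more elaborate than this single comparison.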

Paper Structure

This paper contains 12 sections, 1 equation, 8 figures, 9 tables, 1 algorithm.

Figures (8)

  • Figure 1: A simple example of AutoGrow.
  • Figure 2: An optimization trajectory comparison between (a) Network Morphism and (b) training from scratch.
  • Figure 3: Optimization trajectory of AutoGrow, tested by Basic3ResNet on CIFAR10. (a) c-AutoGrow with staircase learning rate and ZeroInit during growing; (b) c-AutoGrow with constant learning rate and GauInit during growing; (c) p-AutoGrow with $K=50$; and (d) p-AutoGrow with $K=3$. For better illustration, the dots on the trajectory are plotted every $4$, $20$, $5$ and $3$ epochs in (a-d), respectively.
  • Figure 4: p-AutoGrow on CIFAR10 ($K=3$). The seed net is Basic3ResNet-1-1-1.
  • Figure 5: Loss surfaces around minima found by baselines and AutoGrow. Dataset is CIFAR10.
  • ...and 3 more figures