Chen-Yu Lee, Saining Xie, Patrick Gallagher, Zhengyou Zhang, Zhuowen Tu
Abstract
Our proposed deeply-supervised nets (DSN) method simultaneously minimizes
classification error while making the learning process of hidden layers direct
and transparent. We attempt to boost classification performance by
studying a new formulation in deep networks. We examine three aspects of
convolutional neural network (CNN) style architectures: (1) transparency
of the intermediate layers to the overall classification; (2)
discriminativeness and robustness of learned features, especially in the early
layers; (3) effectiveness of training in the presence of exploding and
vanishing gradients. We introduce a "companion objective" to the individual
hidden layers, in addition to the overall objective at the output layer (a
strategy distinct from layer-wise pre-training). We extend techniques from
stochastic gradient methods to analyze our algorithm. Our experimental
results on benchmark datasets show significant performance gains over
existing methods (e.g., state-of-the-art results on all of MNIST, CIFAR-10,
CIFAR-100, and SVHN).
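
To make the companion-objective idea concrete, the following is a minimal
sketch rather than the formulation used in this work. It assumes that each
supervised hidden layer feeds a hypothetical auxiliary classifier producing
its own logits, that every loss term is a softmax cross-entropy (chosen here
only for brevity), and that hypothetical per-layer weights `alphas` balance
the companion terms against the output-layer objective. The names
`softmax_cross_entropy` and `deeply_supervised_loss` are illustrative.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of a softmax over `logits` against an integer class `label`."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def deeply_supervised_loss(output_logits, companion_logits, label, alphas):
    """Output-layer loss plus a weighted sum of per-hidden-layer companion losses.

    companion_logits: one logit vector per supervised hidden layer, produced by
        a hypothetical auxiliary classifier attached to that layer (assumption).
    alphas: hypothetical per-layer weights for the companion terms (assumption).
    """
    if len(companion_logits) != len(alphas):
        raise ValueError("need one alpha per companion classifier")
    # Overall objective at the output layer.
    total = softmax_cross_entropy(output_logits, label)
    # Companion objectives: each hidden layer is supervised directly.
    for alpha, logits in zip(alphas, companion_logits):
        total += alpha * softmax_cross_entropy(logits, label)
    return total


# Example: a two-class problem with two supervised hidden layers.
loss = deeply_supervised_loss(
    output_logits=[2.0, -1.0],
    companion_logits=[[0.5, 0.1], [1.2, -0.3]],
    label=0,
    alphas=[0.3, 0.3],
)
```

Setting every entry of `alphas` to zero recovers the ordinary output-layer
objective, which is one way to see the companion terms as an addition to,
not a replacement for, standard training.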