Data-free parameter pruning for Deep Neural Networks
Suraj Srinivas, R. Venkatesh Babu
TL;DR
The paper introduces a data-free pruning technique that operates on whole neurons rather than individual weights. A saliency-based criterion identifies redundant or near-identical neurons, which are then removed and their contributions merged into the remaining neurons ('surgery'). The method is conceptually related to Optimal Brain Damage and Knowledge Distillation but requires no training data, which makes pruning of fully connected layers fast. On a LeNet-style network trained on MNIST and on AlexNet, it removes up to 85% and about 35% of the total parameters, respectively, without significantly affecting performance, and it runs faster than traditional pruning methods. A practical heuristic based on saliency curves indicates how many neurons to prune, so the approach scales to large networks. Overall, the work offers a fast, data-free way to compress most networks that contain a fully connected layer.
Abstract
Deep neural networks (NNs) with millions of parameters are at the heart of many state-of-the-art computer vision systems today. However, recent works have shown that much smaller models can achieve similar levels of performance. In this work, we address the problem of pruning parameters in a trained NN model. Instead of removing individual weights one at a time, as done in previous works, we remove one neuron at a time. We show how similar neurons are redundant, and propose a systematic way to remove them. Our experiments in pruning the densely connected layers show that we can remove up to 85% of the total parameters in an MNIST-trained network, and about 35% for AlexNet, without significantly affecting performance. Our method can be applied on top of most networks with a fully connected layer to give a smaller network.
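The redundancy argument can be illustrated with a small sketch. It is not the authors' implementation: it assumes a single hidden ReLU layer feeding one scalar output, and a pairwise saliency of the form a_j^2 * ||W_i - W_j||^2 (outgoing weight of the removed neuron times the squared distance between the two incoming weight vectors), which is an assumption made here for illustration. The helper names forward, saliency and prune_one are hypothetical.

```python
def forward(W, a, x):
    """Output of one hidden ReLU layer followed by a linear scalar output.

    W: list of incoming weight vectors, one per hidden neuron.
    a: list of outgoing weights, one per hidden neuron.
    """
    total = 0.0
    for row, a_k in zip(W, a):
        pre_activation = sum(w * x_i for w, x_i in zip(row, x))
        total += a_k * max(0.0, pre_activation)
    return total


def saliency(W, a, i, j):
    """Assumed cost of removing neuron j and merging it into neuron i."""
    sq_dist = sum((w_i - w_j) ** 2 for w_i, w_j in zip(W[i], W[j]))
    return a[j] ** 2 * sq_dist


def prune_one(W, a):
    """Remove the neuron of the lowest-saliency pair (no data needed).

    The removed neuron's outgoing weight is added to its partner's,
    which is exact when the two incoming weight vectors are identical.
    """
    n = len(W)
    s, i, j = min(
        (saliency(W, a, i, j), i, j)
        for i in range(n) for j in range(n) if i != j
    )
    merged = list(a)
    merged[i] += merged[j]
    W_new = [row for k, row in enumerate(W) if k != j]
    a_new = [v for k, v in enumerate(merged) if k != j]
    return W_new, a_new, s


# Toy layer (illustrative values): neurons 0 and 1 share incoming weights.
W = [[1.0, 2.0], [1.0, 2.0], [-0.5, 0.3]]
a = [0.75, -0.25, 1.5]
W_pruned, a_pruned, score = prune_one(W, a)
```

In this toy layer the duplicated pair has zero saliency, so one of the two neurons is dropped and the pruned layer computes the same function as the original for any input. When the weight vectors are only similar rather than identical, the merge is an approximation, and the saliency is meant to rank pairs by how much error the merge is expected to introduce.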
