ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression

Jian-Hao Luo, Jianxin Wu, Weiyao Lin

TL;DR

ThiNet is a data-driven, filter-level pruning method that preserves the original CNN architecture while substantially reducing computation and storage. It decides which filters of a layer to prune from statistics of the next layer, using a data collection step, a greedy channel-selection algorithm, and reconstruction-error minimization with ordinary least squares, followed by fine-tuning. On ILSVRC-12, it achieves a 3.31$\times$ FLOPs reduction and 16.63$\times$ compression on VGG-16 with a 0.52$\%$ top-5 accuracy drop, and removes more than half of the parameters and FLOPs of ResNet-50 at the cost of roughly 1$\%$ top-5 accuracy, with demonstrated transferability to domain-specific datasets. Because the pruned model keeps a standard network structure, it suits deployment on resource-constrained devices and can complement other compression techniques such as quantization and low-rank approximation.

Abstract

We propose an efficient and unified framework, namely ThiNet, to simultaneously accelerate and compress CNN models in both training and inference stages. We focus on filter-level pruning, i.e., a whole filter is discarded if it is less important. Our method does not change the original network structure, so it is fully supported by any off-the-shelf deep learning library. We formally establish filter pruning as an optimization problem, and reveal that filters need to be pruned based on statistical information computed from the next layer, not the current layer, which differentiates ThiNet from existing methods. Experimental results demonstrate the effectiveness of this strategy, which has advanced the state of the art. We also show the performance of ThiNet on the ILSVRC-12 benchmark. ThiNet achieves 3.31$\times$ FLOPs reduction and 16.63$\times$ compression on VGG-16, with only a 0.52$\%$ top-5 accuracy drop. Similar experiments with ResNet-50 reveal that even for a compact network, ThiNet can remove more than half of the parameters and FLOPs, at the cost of a roughly 1$\%$ top-5 accuracy drop. Moreover, the original VGG-16 model can be further pruned into a very small model of only 5.05MB, preserving AlexNet-level accuracy but showing much stronger generalization ability.
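To make the idea of pruning from next-layer statistics concrete, here is a minimal sketch of a greedy channel selection step. It assumes a hypothetical data layout in which `samples[i][j]` holds the contribution of input channel `j` to the `i`-th sampled output value of the next layer, and it greedily grows the set of removed channels so that the squared sum of their contributions stays as small as possible. The function name, the `keep_ratio` parameter, and the layout are illustrative assumptions, not the authors' code; the least-squares rescaling and fine-tuning stages are omitted.

```python
from typing import List, Sequence


def greedy_channel_selection(
    samples: Sequence[Sequence[float]], keep_ratio: float
) -> List[int]:
    """Return the indices of the channels to keep (illustrative sketch).

    samples[i][j]: contribution of input channel j to the i-th sampled
    next-layer output value (assumed layout, not taken from the paper's code).
    """
    num_channels = len(samples[0])
    num_remove = num_channels - int(round(num_channels * keep_ratio))
    removed: List[int] = []
    # Running per-sample sum of the contributions of already removed channels.
    partial = [0.0] * len(samples)

    for _ in range(num_remove):
        best_channel, best_cost = -1, float("inf")
        for c in range(num_channels):
            if c in removed:
                continue
            # Cost of removing channel c in addition to the current set:
            # squared sum of removed contributions, accumulated over samples.
            cost = sum((p + row[c]) ** 2 for p, row in zip(partial, samples))
            if cost < best_cost:
                best_channel, best_cost = c, cost
        removed.append(best_channel)
        partial = [p + row[best_channel] for p, row in zip(partial, samples)]

    return [c for c in range(num_channels) if c not in removed]


# Two samples, four channels: channels 1 and 3 contribute almost nothing.
toy_samples = [
    [1.0, 0.01, -2.0, 0.02],
    [0.5, -0.01, 1.5, 0.01],
]
kept_channels = greedy_channel_selection(toy_samples, keep_ratio=0.5)
print(kept_channels)
```

In this toy input the two near-zero channels are removed first, so the kept channels are the two with large contributions. The filters in the current layer that produce the removed channels would then be discarded as well.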

Paper Structure

This paper contains 15 sections, 7 equations, 4 figures, 4 tables, 1 algorithm.

Figures (4)

  • Figure 1: Illustration of ThiNet. First, we focus on the dotted box part to determine several weak channels and their corresponding filters (highlighted in yellow in the first row). These channels (and their associated filters) have little contribution to the overall performance, thus can be discarded, leading to a pruned model. Finally, the network is fine-tuned to recover its accuracy. (This figure is best viewed in color.)
  • Figure 2: Illustration of data sampling and variables' relationship.
  • Figure 3: Illustration of the ResNet pruning strategy. For each residual block, we only prune the first two convolutional layers, keeping the block output dimension unchanged.
  • Figure 4: Performance comparison of different channel selection methods: the VGG-16-GAP model pruned on CUB-200 with different compression rates. (This figure is best viewed in color and zoomed in.)
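The ResNet strategy in the Figure 3 caption can be illustrated with some simple weight-count arithmetic. The sketch below assumes a standard 1x1, 3x3, 1x1 bottleneck block with 256 input channels, 64 middle channels, and 256 output channels, and counts convolution weights only (no biases or batch normalization). Pruning the filters of the first two layers shrinks the middle width while the block output stays at 256 channels. The function names and the `keep_ratio` parameter are hypothetical, chosen for illustration.

```python
def bottleneck_params(in_channels: int, mid_channels: int, out_channels: int) -> int:
    """Convolution weight count of a 1x1 -> 3x3 -> 1x1 bottleneck block."""
    conv1 = in_channels * mid_channels            # 1x1 convolution
    conv2 = 3 * 3 * mid_channels * mid_channels   # 3x3 convolution
    conv3 = mid_channels * out_channels           # 1x1 convolution
    return conv1 + conv2 + conv3


def pruned_bottleneck_params(
    in_channels: int, mid_channels: int, out_channels: int, keep_ratio: float
) -> int:
    """Weight count after pruning filters of the first two layers only.

    Fewer filters in conv1 and conv2 also mean fewer input channels for
    conv2 and conv3, but the block's output width is left unchanged.
    """
    kept = max(1, round(mid_channels * keep_ratio))
    return bottleneck_params(in_channels, kept, out_channels)


original = bottleneck_params(256, 64, 256)
pruned = pruned_bottleneck_params(256, 64, 256, keep_ratio=0.5)
print(original, pruned, round(pruned / original, 3))
```

Under these assumptions, keeping half of the middle filters leaves 25,600 of the original 69,632 weights, because the 3x3 layer shrinks quadratically while the two 1x1 layers shrink linearly.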