Block Pruning for Enhanced Efficiency in Convolutional Neural Networks
Cheng-En Wu, Azadeh Davoodi, Yu Hen Hu
TL;DR
The paper tackles the challenge of deploying CNNs on edge devices by proposing a direct block-pruning method that starts from pre-trained models and uses transfer learning after each pruning step. It identifies valid blocks, prunes them individually, and evaluates the exact impact on accuracy rather than relying on proxy metrics, thereby addressing dimensionality-mismatch issues without compensatory blocks. Experiments across CIFAR-10, CIFAR-100, and ImageNet with ResNet variants demonstrate that the approach can substantially reduce model size while maintaining high accuracy, especially on large-scale data such as ImageNet with ResNet50. The method consistently outperforms proxy-based baselines and shows robustness on deeper networks, highlighting its practical potential for resource-constrained edge environments.
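The loop described above (find valid blocks, remove each one in turn, measure accuracy directly, then fine-tune) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The `Block` tuple, `valid_blocks`, `greedy_block_prune`, and the toy `evaluate`/`fine_tune` stand-ins are all hypothetical names. The rule that a block is "valid" when its input and output channel counts match is an assumption made here for illustration, as is the greedy choice of removing the block whose removal costs the least accuracy.

```python
from typing import Callable, List, Tuple

# Hypothetical block description: (name, in_channels, out_channels).
Block = Tuple[str, int, int]


def valid_blocks(blocks: List[Block]) -> List[str]:
    """Names of blocks assumed removable without a shape mismatch.

    Assumption (not from the paper): a block is valid when its input and
    output channel counts match, so its neighbours still connect after removal.
    """
    return [name for name, c_in, c_out in blocks if c_in == c_out]


def greedy_block_prune(
    blocks: List[Block],
    evaluate: Callable[[List[Block]], float],
    fine_tune: Callable[[List[Block]], None],
    n_prune: int,
) -> Tuple[List[Block], List[str]]:
    """Remove up to n_prune blocks, one per step, by measured accuracy."""
    kept = list(blocks)
    removed: List[str] = []
    for _ in range(n_prune):
        candidates = valid_blocks(kept)
        if not candidates:
            break
        # Directly measure accuracy with each candidate removed (no proxy score).
        scores = {
            name: evaluate([b for b in kept if b[0] != name])
            for name in candidates
        }
        best = max(scores, key=scores.get)  # removal that hurts accuracy least
        kept = [b for b in kept if b[0] != best]
        fine_tune(kept)  # stand-in for the transfer-learning step
        removed.append(best)
    return kept, removed


# Toy stand-ins: "accuracy" is the summed importance of the remaining blocks.
importance = {"b1": 0.05, "b2": 0.01, "down": 0.30, "b3": 0.20}
toy_blocks: List[Block] = [
    ("b1", 16, 16), ("b2", 16, 16), ("down", 16, 32), ("b3", 32, 32),
]
fine_tune_calls: List[int] = []


def toy_evaluate(model: List[Block]) -> float:
    return sum(importance[name] for name, _, _ in model)


def toy_fine_tune(model: List[Block]) -> None:
    fine_tune_calls.append(len(model))  # a real version would retrain here


kept, removed = greedy_block_prune(toy_blocks, toy_evaluate, toy_fine_tune, n_prune=2)
```

In a real setting, `evaluate` would run the pruned network on held-out data and `fine_tune` would retrain it. The sketch passes both in as callables so the selection logic stays independent of any particular framework. The cost of measuring accuracy directly is one evaluation per candidate block per pruning step.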
Abstract
This paper presents a novel approach to network pruning that targets block pruning in deep neural networks for edge computing environments. Our method diverges from traditional techniques that rely on proxy metrics: it removes each block directly and measures the resulting change in classification accuracy, which gives an accurate evaluation of each block's importance. We conducted extensive experiments on the CIFAR-10, CIFAR-100, and ImageNet datasets using ResNet architectures. Our results demonstrate the efficacy of our method, particularly on large-scale datasets such as ImageNet with ResNet50, where it reduced model size while retaining high accuracy, even when a significant portion of the network was pruned. The findings underscore our method's ability to balance model size and performance, especially in resource-constrained edge computing scenarios.
