Block Pruning for Enhanced Efficiency in Convolutional Neural Networks

Cheng-En Wu, Azadeh Davoodi, Yu Hen Hu

TL;DR

The paper tackles the challenge of deploying CNNs on edge devices by proposing a direct block-pruning method that starts from pre-trained models and uses transfer learning after each pruning step. It identifies valid blocks, prunes them individually, and evaluates the exact impact on accuracy rather than relying on proxy metrics, thereby addressing dimensionality-mismatch issues without compensatory blocks. Experiments across CIFAR-10, CIFAR-100, and ImageNet with ResNet variants demonstrate that the approach can substantially reduce model size while maintaining high accuracy, especially on large-scale data such as ImageNet with ResNet50. The method consistently outperforms proxy-based baselines and shows robustness on deeper networks, highlighting its practical potential for resource-constrained edge environments.

Abstract

This paper presents a novel approach to network pruning, targeting block pruning in deep neural networks for edge computing environments. Our method diverges from traditional techniques that utilize proxy metrics, instead employing a direct block removal strategy to assess the impact on classification accuracy. This hands-on approach allows for an accurate evaluation of each block's importance. We conducted extensive experiments on CIFAR-10, CIFAR-100, and ImageNet datasets using ResNet architectures. Our results demonstrate the efficacy of our method, particularly on large-scale datasets like ImageNet with ResNet50, where it excelled in reducing model size while retaining high accuracy, even when pruning a significant portion of the network. The findings underscore our method's capability in maintaining an optimal balance between model size and performance, especially in resource-constrained edge computing scenarios.
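The direct-removal idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a greedy loop that tries removing each prunable block, measures accuracy for each trial, and permanently removes the block whose removal hurts least. The block names, the `evaluate` callback, and the toy importance scores are all hypothetical. In a real setting `evaluate` would fine-tune the pruned network and measure validation accuracy. The protected set mirrors the restriction stated in the Figure 1 caption (initial Conv, first block of each Conv block, final FC layer).

```python
from typing import Callable, FrozenSet, List, Tuple


def greedy_block_prune(
    blocks: List[str],
    protected: FrozenSet[str],
    evaluate: Callable[[List[str]], float],
    num_to_remove: int,
) -> Tuple[List[str], List[str]]:
    """Remove blocks one at a time, always dropping the block whose
    removal leaves the highest measured accuracy (an assumed greedy rule)."""
    current = list(blocks)
    removed: List[str] = []
    for _ in range(num_to_remove):
        # Only blocks outside the protected set are valid candidates.
        candidates = [b for b in current if b not in protected]
        if not candidates:
            break
        best_block, best_acc = None, float("-inf")
        for b in candidates:
            trial = [x for x in current if x != b]
            # Direct evaluation of the pruned network, no proxy metric.
            acc = evaluate(trial)
            if acc > best_acc:
                best_block, best_acc = b, acc
        current.remove(best_block)
        removed.append(best_block)
    return current, removed


# Hypothetical toy model: each block has a made-up accuracy contribution.
TOY_IMPORTANCE = {
    "stem": 0.50, "s1b0": 0.30, "s1b1": 0.02, "s1b2": 0.05,
    "s2b0": 0.30, "s2b1": 0.01, "fc": 0.90,
}


def toy_evaluate(kept: List[str]) -> float:
    """Stand-in for 'fine-tune, then measure validation accuracy'."""
    missing = set(TOY_IMPORTANCE) - set(kept)
    return 0.93 - sum(TOY_IMPORTANCE[b] for b in missing)


all_blocks = ["stem", "s1b0", "s1b1", "s1b2", "s2b0", "s2b1", "fc"]
protected_blocks = frozenset({"stem", "s1b0", "s2b0", "fc"})
kept, removed = greedy_block_prune(
    all_blocks, protected_blocks, toy_evaluate, num_to_remove=2
)
print(kept, removed)
```

Because every candidate is evaluated directly, each pruning step costs one evaluation per prunable block. That is the trade-off against proxy metrics, which are cheaper but only estimate the accuracy impact.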

Paper Structure

This paper contains 10 sections, 6 figures.

Figures (6)

  • Figure 1: The diagram shows a deep neural network composed of several stacked Convolutional (Conv) blocks, each of which can be classified into 10 distinct types. Pruning is not applied to the truncated Conv block (the initial Conv in the network), the first block within each Conv block, or the final fully connected (FC) layer.
  • Figure 2: ResNet20 on CIFAR-10
  • Figure 3: ResNet20 on CIFAR-100
  • Figure 4: ResNet56 on CIFAR-10
  • Figure 5: ResNet56 on CIFAR-100
  • ...and 1 more figure