Pruning Deep Convolutional Neural Network Using Conditional Mutual Information

Tien Vu-Van, Dat Du Thanh, Nguyen Ho, Mai Vu

TL;DR

A structured filter-pruning approach for CNNs that identifies and selectively retains the most informative features in each layer based on Conditional Mutual Information values, computed using a matrix-based Rényi α-order entropy numerical method.

Abstract

Convolutional Neural Networks (CNNs) achieve high performance in image classification tasks but are challenging to deploy on resource-limited hardware due to their large model sizes. To address this issue, we leverage Mutual Information, a metric that provides valuable insights into how deep learning models retain and process information by measuring the information shared between network layers and either the input features or the output labels. In this study, we propose a structured filter-pruning approach for CNNs that identifies and selectively retains the most informative features in each layer. Our approach evaluates the layers successively, ranking the importance of each layer's feature maps by their Conditional Mutual Information (CMI) values, computed using a matrix-based Rényi α-order entropy numerical method. We propose several formulations of CMI to capture correlation among features across different layers. We then develop various strategies to determine the cutoff point for CMI values, below which features are pruned as unimportant. This approach allows parallel pruning in both forward and backward directions and significantly reduces model size while preserving accuracy. Tested on the VGG16 architecture with the CIFAR-10 dataset, the proposed method reduces the number of filters by more than a third, with only a 0.32% drop in test accuracy.
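
To make the entropy estimator named in the abstract more concrete, here is a minimal sketch of the standard matrix-based Rényi α-order entropy and of a CMI built from it via the identity I(X;Y|Z) = S(X,Z) + S(Y,Z) - S(Z) - S(X,Y,Z). The Gaussian kernel, the bandwidth `sigma`, the default `alpha=2`, and all function names are assumptions chosen for illustration. The paper's own CMI formulations and its treatment of feature maps may differ.

```python
import numpy as np

def gram_matrix(x, sigma=1.0):
    """Trace-normalized Gaussian Gram matrix for n samples (assumed kernel)."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(len(x), -1)                      # flatten each sample
    sq = np.sum(x ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    return k / np.trace(k)

def renyi_entropy(a, alpha=2.0):
    """S_alpha(A) = log2(sum_i lambda_i(A)^alpha) / (1 - alpha), alpha != 1."""
    eig = np.clip(np.linalg.eigvalsh(a), 0.0, None)  # drop tiny negative eigenvalues
    return np.log2(np.sum(eig ** alpha)) / (1.0 - alpha)

def joint_entropy(*mats, alpha=2.0):
    """Joint entropy from the trace-normalized Hadamard product of Gram matrices."""
    h = mats[0]
    for m in mats[1:]:
        h = h * m                                  # element-wise product
    return renyi_entropy(h / np.trace(h), alpha)

def conditional_mutual_information(ax, ay, az, alpha=2.0):
    """I(X;Y|Z) = S(X,Z) + S(Y,Z) - S(Z) - S(X,Y,Z)."""
    return (joint_entropy(ax, az, alpha=alpha)
            + joint_entropy(ay, az, alpha=alpha)
            - renyi_entropy(az, alpha)
            - joint_entropy(ax, ay, az, alpha=alpha))
```

In a pruning setting, `ax` would typically be built from a candidate feature map across a batch, `ay` from the labels or another layer's output, and `az` from the feature maps already selected. That mapping is an interpretation of the abstract, not a detail confirmed by it.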

Paper Structure

This paper contains 29 sections, 15 equations, 4 figures, 7 tables, 6 algorithms.

Figures (4)

  • Figure 1: Example of ordered feature maps obtained with the cross-layer compact CMI computation algorithm. The top-left image is the input, labeled "truck". The vertical axis shows the computed CMI value and the horizontal axis shows the index of the newly added ordered feature map.
  • Figure 2: Example of cutoff points by Scree test and X-Means.
  • Figure 3: Overview of the CMI-based pruning process. The blue curve shows the decreasing list of CMI values obtained as new feature maps are sequentially added to the ordered set of each layer. The red vertical lines indicate candidate cutoff points for the CMI list. The important feature maps to be selected and retained are those to the left of the red lines.
  • Figure 4: Illustration of the process of a sample CNN model.
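
The cutoff idea in Figures 2 and 3 can be sketched with a simple elbow rule on a decreasing list of CMI values. The version below is one common numerical reading of a Scree test: it picks the point farthest from the chord joining the first and last values, after rescaling both axes to [0, 1]. The function name `scree_cutoff` and the choice to keep the elbow point itself are assumptions. The paper's exact criterion is not stated in this summary, and the X-Means variant is not shown.

```python
import numpy as np

def scree_cutoff(values):
    """Return how many leading items to keep (elbow = max distance to the chord)."""
    v = np.asarray(values, dtype=float)
    n = len(v)
    span = v.max() - v.min() if n else 0.0
    if n < 3 or span == 0.0:
        return n                                   # nothing to cut
    x = np.arange(n) / (n - 1)                     # rescale index to [0, 1]
    y = (v - v.min()) / span                       # rescale values to [0, 1]
    dx, dy = x[-1] - x[0], y[-1] - y[0]
    # perpendicular distance of every point to the line through the endpoints
    dist = np.abs(dy * (x - x[0]) - dx * (y - y[0])) / np.hypot(dx, dy)
    return int(np.argmax(dist)) + 1                # keep items up to the elbow

cmi_values = [8.0, 4.0, 2.0, 1.0, 0.9, 0.8, 0.7]   # made-up example values
keep = scree_cutoff(cmi_values)                    # keeps the first 3 feature maps
```

Feature maps ranked after position `keep` would then be the pruning candidates for that layer.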