FGP: Feature-Gradient-Prune for Efficient Convolutional Layer Pruning

Qingsong Lv, Jiasheng Sun, Sheng Zhou, Xu Zhang, Liangcheng Li, Yun Gao, Sun Qiao, Jie Song, Jiajun Bu

TL;DR

Experiments show that FGP produces more compact and practical models while keeping performance stable, reducing computational cost with less accuracy loss than existing pruning methods.

Abstract

To reduce computational overhead while maintaining model performance, model pruning techniques have been proposed. Among these, structured pruning, which removes entire convolutional channels or layers, significantly enhances computational efficiency and is compatible with hardware acceleration. However, existing pruning methods that rely solely on image features or gradients often result in the retention of redundant channels, negatively impacting inference efficiency. To address this issue, this paper introduces a novel pruning method called Feature-Gradient Pruning (FGP). This approach integrates both feature-based and gradient-based information to more effectively evaluate the importance of channels across various target classes, enabling a more accurate identification of channels that are critical to model performance. Experimental results demonstrate that the proposed method improves both model compactness and practicality while maintaining stable performance. Experiments conducted across multiple tasks and datasets show that FGP significantly reduces computational costs and minimizes accuracy loss compared to existing methods, highlighting its effectiveness in optimizing pruning outcomes. The source code is available at: https://github.com/FGP-code/FGP.

Paper Structure

This paper contains 18 sections, 11 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Visualization of one channel from each of the four convolutional layers, along with heatmaps for three classes. FGP retains Channels 2 and 4; the pruned model keeps only those channels with strong support values across all classes.
  • Figure 2: FGP pruning framework, where $\text{Conv}_{j}$ is used as an example. The process calculates each channel’s support value across all classes in the dataset. These support values are then summed to assess the overall importance of each channel. Channels are ranked by their importance scores, and the Top $k$ channels are selected. Pruning is then applied based on these rankings, resulting in a refined, pruned set of channels for the layer. Best viewed on screen.
  • Figure 3: Parametric experiments with Top $k$ and classes
  • Figure 4: Accuracy for different channel retention configurations
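
The selection procedure described in the Figure 2 caption (per-class support values, summed into an importance score, ranked, Top $k$ kept) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the support value used here, the spatial mean of ReLU(feature × gradient), is a hypothetical stand-in because this section does not give the paper's formula, and the function names `channel_importance`, `select_top_k`, and `prune_conv_weight` are invented for the example.

```python
import numpy as np


def channel_importance(features, grads):
    """Sum hypothetical per-class support values into one score per channel.

    features, grads: arrays of shape (num_classes, C, H, W) holding, for each
    class, a layer's activations and the gradient of that class's score with
    respect to those activations.
    ASSUMPTION: support = spatial mean of ReLU(feature * gradient). The exact
    support formula is not given in this section.
    """
    support = np.maximum(features * grads, 0.0).mean(axis=(2, 3))  # (num_classes, C)
    return support.sum(axis=0)  # (C,)


def select_top_k(importance, k):
    """Rank channels by importance and return the top-k indices in ascending order."""
    order = np.argsort(-importance, kind="stable")
    return np.sort(order[:k])


def prune_conv_weight(weight, keep):
    """Keep only the selected output channels of a (C_out, C_in, kH, kW) weight."""
    return weight[keep]


# Toy data: 3 classes, 4 channels, 2x2 feature maps with constant gradients.
per_channel_grad = np.array([
    [0.1, 0.9, -0.5, 0.8],
    [0.2, 0.7, 0.1, 0.9],
    [-0.3, 0.8, 0.0, 0.6],
])
features = np.ones((3, 4, 2, 2))
grads = per_channel_grad[:, :, None, None] * np.ones((3, 4, 2, 2))

importance = channel_importance(features, grads)
keep = select_top_k(importance, k=2)
pruned = prune_conv_weight(np.zeros((4, 3, 3, 3)), keep)
print(importance, keep, pruned.shape)
```

In this toy setup the two channels with consistently positive gradients across all three classes are the ones retained, which mirrors the behaviour the Figure 1 caption describes. In a real network, the input channels of the following layer would need to be sliced with the same indices.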