Less is More: The Influence of Pruning on the Explainability of CNNs

Florian Merkle, David Weber, Pascal Schöttle, Stephan Schlögl, Martin Nocker

TL;DR

This work investigates whether reducing CNN parameter counts via pruning can enhance explainability to humans. Using Grad-CAM heatmaps on a VGG-16 model trained on Imagenette, the authors conduct a three-phase human-grounded study across compression rates {2,4,8,32}, revealing a sweet spot at mild pruning (CR≈2) that improves both explainability and, in some cases, accuracy. Heavier pruning (CR≥8) generally degrades human interpretability and task performance, underscoring a trade-off between model simplicity and faithful explanations. The study contributes a human-centered evaluation framework for pruning-driven explainability and highlights practical implications for deploying CNNs in resource-constrained settings while maintaining trust and transparency.

Abstract

Over the last decade, deep learning models have become the state of the art for solving complex computer vision problems. These modern computer vision models have millions of parameters, which presents two major challenges: (1) the increased computational requirements hamper deployment in resource-constrained environments, such as mobile or IoT devices, and (2) explaining the complex decisions of such networks to humans is challenging. Network pruning is a technical approach to reducing the complexity of models by removing less important parameters. The work presented in this paper investigates whether this reduction in technical complexity also helps with perceived explainability. To do so, we conducted a pre-study and two human-grounded experiments, assessing the effects of different pruning ratios on explainability. Overall, we evaluate four different compression rates (i.e., 2, 4, 8, and 32) with 37,500 tasks on Mechanical Turk. Results indicate that lower compression rates have a positive influence on explainability, while higher compression rates show negative effects. Furthermore, we were able to identify sweet spots that increase both the perceived explainability and the model's performance.
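
To make the compression rates concrete, here is a minimal Python sketch. It rests on two assumptions that this section does not confirm: that a compression rate (CR) is the ratio of original to remaining parameters, as is common in the pruning literature, and that "less important" means smallest absolute value (magnitude pruning). The function names `fraction_removed` and `magnitude_prune` are hypothetical, and the paper's actual pruning procedure may differ.

```python
def fraction_removed(compression_rate: float) -> float:
    """Fraction of parameters pruned, ASSUMING CR = original / remaining."""
    if compression_rate < 1:
        raise ValueError("compression rate must be >= 1")
    return 1.0 - 1.0 / compression_rate


def magnitude_prune(weights: list[float], compression_rate: float) -> list[float]:
    """Zero out the smallest-magnitude weights so that roughly 1/CR remain.

    Illustrative stand-in for 'removing less important parameters';
    not necessarily the criterion used by the authors.
    """
    # Keep at least one weight so the result is never entirely zero.
    n_keep = max(1, round(len(weights) / compression_rate))
    # Indices of the n_keep largest-magnitude weights.
    by_magnitude = sorted(range(len(weights)),
                          key=lambda i: abs(weights[i]),
                          reverse=True)
    keep = set(by_magnitude[:n_keep])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


if __name__ == "__main__":
    # The four compression rates evaluated in the study.
    for cr in (2, 4, 8, 32):
        print(f"CR {cr:>2}: {fraction_removed(cr):.3%} of parameters removed")

    toy_weights = [0.5, -0.1, 0.3, -0.9, 0.05, 0.2, -0.7, 0.01]
    print(magnitude_prune(toy_weights, 4))
```

Under this reading, the mildest setting (CR 2) already removes half of the parameters, while CR 32 leaves only about 3% of them.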

Paper Structure

This paper contains 27 sections, 2 equations, 18 figures, 8 tables, and 1 algorithm.

Figures (18)

  • Figure 1: Which algorithm is more reasonable? In the middle, we show the original picture; on the left, the explainability heat map of compression rate 1; and on the right, the explainability heat map for compression rate 8. Red colors indicate more important regions, while blue colors indicate less important regions.
  • Figure 2: Top-1 test-set accuracies (dark blue, left y-axis), human rater accuracies (light blue, left y-axis), and our explainability measure (orange, right y-axis) for different compression rates.
  • Figure 3: Example images of the Imagenette dataset with their labels.
  • Figure 4: An image of the class 'dog' with its heat map and occlusion map, based on Grad-CAM and our calculations.
  • Figure 5: Experimental Setup - Experiment 2: occlusion map (CR 1).
  • ...and 13 more figures