Confident magnitude-based neural network pruning

Joaquin Alvarez

TL;DR

The paper addresses safe, uncertainty-aware pruning of pretrained neural networks by introducing distribution-free, finite-sample guarantees for magnitude-based one-shot pruning. It leverages the Learn-Then-Test framework to calibrate the pruning ratio $\lambda$ on a dataset $\mathcal{D}_{cal}$, ensuring $\mathbb{P}(\mathbb{E}[\ell(\hat{Y}_{full}(X), \hat{Y}_{\lambda}(X))] \leq \alpha) \geq 1-\delta$ via a fixed-sequence, FWER-controlled procedure with super-uniform p-values. It supports labeled, unlabeled, and selective-prediction losses, and validates the approach on MNIST classification and PolypGen segmentation, demonstrating that meaningful sparsity can be achieved under rigorous risk guarantees. The work highlights the practical impact of formally calibrated pruning for safe and efficient deployment of sparse computer vision models, and discusses limitations and potential improvements with architecture-aware strategies and alternative calibration settings.
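The fixed-sequence step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `fixed_sequence_select` and its arguments are hypothetical, and it assumes the candidate pruning ratios are ordered from least to most aggressive and that a valid p-value is already available for each one.

```python
def fixed_sequence_select(lambdas, p_values, delta):
    """Return the most aggressive pruning ratio certified at level delta.

    Assumptions (not taken from the source): `lambdas` is sorted from the
    least to the most aggressive pruning ratio, and `p_values[j]` is a valid
    (super-uniform) p-value for the null "the risk at lambdas[j] exceeds alpha".
    """
    if len(lambdas) != len(p_values):
        raise ValueError("lambdas and p_values must have the same length")

    selected = None
    for lam, p in zip(lambdas, p_values):
        # Fixed-sequence testing: test hypotheses in the pre-specified order
        # and stop at the first one that cannot be rejected at level delta.
        if p > delta:
            break
        selected = lam
    return selected  # None means no pruning ratio could be certified


# Hypothetical p-values for four candidate pruning ratios.
chosen = fixed_sequence_select([0.1, 0.2, 0.3, 0.4], [0.001, 0.02, 0.3, 0.01], delta=0.1)
```

Note that the last ratio is not selected even though its p-value is small: once the procedure stops, later hypotheses are never tested, which is what keeps the family-wise error rate at most $\delta$.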

Abstract

Pruning neural networks has proven to be a successful approach to increase the efficiency and reduce the memory footprint of deep learning models without compromising performance. Previous literature has shown that it is possible to achieve a sizable reduction in the number of parameters of a deep neural network without deteriorating its predictive capacity in one-shot pruning regimes. Our work builds on this background to provide rigorous uncertainty quantification for pruning neural networks reliably, which has not been addressed to a great extent in previous literature focusing on pruning methods in computer vision settings. We leverage recent techniques in distribution-free uncertainty quantification to provide finite-sample statistical guarantees when compressing deep neural networks, while maintaining high performance. Moreover, this work presents experiments in computer vision tasks to illustrate how uncertainty-aware pruning is a useful approach to deploy sparse neural networks safely.

Paper Structure (11 sections, 3 theorems, 24 equations, 8 figures, 2 tables, 3 algorithms)


Key Result

Proposition 1

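Let $\ell$ be a binary loss function (that is, taking values only in $\{0,1\}$). For any given $j\in \{0,1,\dots, Q-1\}$, $p_j\coloneqq \mathbb{P}(\mathrm{Bin}(n,\alpha)\leq n\hat{R}_j)$ is a super-uniform p-value. That is, $\mathbb{P}(p_j \leq u) \leq u$ for all $u \in [0,1]$ under the null hypothesis that the risk at the $j$-th pruning ratio exceeds $\alpha$.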
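The p-value in Proposition 1 is a binomial tail probability, so it can be computed directly from the calibration losses. The following is a minimal sketch using only the standard library; the helper names `binomial_cdf` and `binary_loss_p_value` are hypothetical, and the code assumes the losses are a list of 0/1 values observed on $n$ calibration points, so that their sum equals $n\hat{R}_j$.

```python
import math


def binomial_cdf(k, n, p):
    """P(Bin(n, p) <= k), with each term evaluated in log space."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    total = 0.0
    for i in range(k + 1):
        # log of C(n, i) * p^i * (1 - p)^(n - i)
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def binary_loss_p_value(losses, alpha):
    """p_j = P(Bin(n, alpha) <= n * R_hat_j) for 0/1 calibration losses."""
    n = len(losses)
    n_errors = sum(losses)  # equals n * R_hat_j for a binary loss
    return binomial_cdf(n_errors, n, alpha)


# Ten calibration points with no errors, target risk alpha = 0.1.
p_example = binary_loss_p_value([0] * 10, alpha=0.1)
```

With no observed errors the p-value reduces to $(1-\alpha)^n$, which shows why a small calibration set cannot certify a small $\alpha$ even when the empirical risk is zero.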

Figures (8)

  • Figure 1: Visual representation of pruning a neural network. Thickness of the edges represents the magnitude of the weights. Thin edges were removed (weights set to zero) in the pruned neural network.
  • Figure 2: Neural network architecture for our MNIST experiments.
  • Figure 3: Pixel representations of the inputs in the dataset (panel `mnist`) and average magnitude weight map in a pixel representation (panel `prunedWeights`).
  • Figure 4: Risk as a function of the pruning ratio in the calibration dataset.
  • Figure 5: Bootstrap distribution of the empirical risk with the loss function (equation `unlabeled_loss`) on the validation dataset, calibrating the pruning ratio with $\alpha=0.02$ and $B=10{,}000$ bootstrap resamples. Panel `fig:sub1` corresponds to pruning with a naïve approach that does not account for uncertainty. Panel `fig:sub2` was obtained by calibrating with fixed-sequence testing, taking $\delta=0.10$.
  • ...and 3 more figures
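
The pruning operation depicted in Figure 1, where low-magnitude weights are set to zero, can be sketched as follows. This is an illustrative example on a flat list of weights, not the paper's code: the function name `magnitude_prune` is hypothetical, and it assumes global (rather than per-layer) one-shot pruning with the number of pruned weights rounded down.

```python
def magnitude_prune(weights, ratio):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    `ratio` is the fraction of weights to remove; the count is rounded down
    (an assumption made for this sketch).
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    n_prune = int(ratio * len(weights))
    # Indices ordered by increasing absolute value of the weight.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0  # "thin edges" are removed
    return pruned


sparse = magnitude_prune([0.5, -0.1, 2.0, 0.05], ratio=0.5)
```

Here the two weights with the smallest absolute values are zeroed, while the sign and value of the remaining weights are left unchanged.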

Theorems & Definitions (3)

  • Proposition 1
  • Proposition 2
  • Proposition 3