Optimal Parameter and Neuron Pruning for Out-of-Distribution Detection

Chao Chen, Zhihang Fu, Kai Liu, Ze Chen, Mingyuan Tao, Jieping Ye

TL;DR

This work tackles the challenge of detecting out-of-distribution (OOD) samples without costly retraining by introducing Optimal Parameter and Neuron Pruning (OPNP), a training-free method that prunes parameters and pre-logit neurons based on gradient-derived sensitivity to the energy score. By computing the sensitivity $M_{ij}=\frac{1}{m}\sum_{k=1}^m|g_{ij}(\bm{x}_k)|$, the mean absolute gradient over $m$ training samples, and removing weights and neurons with extreme sensitivities, OPNP reduces overfitting and softens sharp decision boundaries, thereby improving ID–OOD separability. Across ImageNet-1K and CIFAR benchmarks with ResNet50 and ViT-B/16, OPNP, especially when pruning high-sensitivity components, consistently surpasses state-of-the-art post-hoc methods in FPR95 and AUROC and even enhances model calibration. The approach is compatible with other post-hoc methods and highlights the practical value of sensitivity-guided pruning for robust OOD detection in real-world deployments.
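As a rough illustration of the sensitivity described above, the following NumPy sketch computes $M_{ij}$ for the weights of a final fully-connected layer. It assumes the energy score is the negative log-sum-exp of the logits (temperature 1) and that pre-logit features have already been extracted; the function names and array shapes are illustrative and not taken from the paper.

```python
import numpy as np


def energy_score(features, W, b):
    """Energy score per sample, assumed here to be -logsumexp(logits)."""
    logits = features @ W.T + b                      # (m, C)
    zmax = logits.max(axis=1, keepdims=True)         # subtract max for stability
    lse = zmax[:, 0] + np.log(np.exp(logits - zmax).sum(axis=1))
    return -lse                                      # (m,)


def weight_sensitivity(features, W, b):
    """Mean absolute gradient of the energy score w.r.t. each weight W[i, j].

    For logits z = W h + b, dE/dW[i, j] = -softmax(z)[i] * h[j], so
    |dE/dW[i, j]| = softmax(z)[i] * |h[j]| because softmax is non-negative.
    """
    logits = features @ W.T + b
    logits = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)                # softmax, shape (m, C)
    m = features.shape[0]
    return p.T @ np.abs(features) / m                # (C, D), average over m samples


# Toy usage with random data (hypothetical sizes: 8 samples, 5 features, 3 classes).
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 5))
W = rng.normal(size=(3, 5))
b = rng.normal(size=3)
M = weight_sensitivity(feats, W, b)
print(M.shape)
```

The closed-form gradient keeps the sketch free of an autodiff dependency; for deeper layers one would typically rely on a framework's automatic differentiation instead.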

Abstract

For a machine learning model deployed in real-world scenarios, the ability to detect out-of-distribution (OOD) samples is indispensable yet challenging. Most existing OOD detection methods focus on advanced training techniques or training-free tricks that prevent the model from yielding overconfident scores for unknown samples. Training-based methods incur expensive training costs and rely on OOD samples, which are not always available, while most training-free methods cannot efficiently utilize the prior information in the training data. In this work, we propose an Optimal Parameter and Neuron Pruning (OPNP) approach, which aims to identify and remove the parameters and neurons that lead to over-fitting. The method consists of two steps. In the first step, we evaluate the sensitivity of the model parameters and neurons by averaging gradients over all training samples. In the second step, the parameters and neurons with exceptionally large or close-to-zero sensitivities are removed for prediction. Our proposal is training-free, is compatible with other post-hoc methods, and exploits the information from all training data. Extensive experiments on multiple OOD detection tasks and model architectures show that the proposed OPNP consistently outperforms existing methods by a large margin.
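The second step can be sketched as follows, under the assumption that "removing" a parameter or neuron means zeroing it at inference time and that the pruning budgets are given as percentages of the least and most sensitive entries. The function name `prune_by_sensitivity`, its arguments, and the example percentages are hypothetical and not taken from the paper.

```python
import numpy as np


def prune_by_sensitivity(values, sensitivity, rho_min=0.0, rho_max=0.0):
    """Zero the entries whose sensitivity is among the lowest rho_min percent
    or the highest rho_max percent. Returns the pruned copy and the keep mask.
    """
    order = np.argsort(sensitivity, axis=None)       # flat indices, ascending
    n = order.size
    n_low = int(round(n * rho_min / 100.0))          # near-zero sensitivities
    n_high = int(round(n * rho_max / 100.0))         # exceptionally large ones
    keep = np.ones(n, dtype=bool)
    keep[order[:n_low]] = False
    keep[order[n - n_high:]] = False                 # empty slice when n_high == 0
    keep = keep.reshape(sensitivity.shape)
    return np.where(keep, values, 0.0), keep


# Toy usage: 10 weights with sensitivities 0..9, prune lowest 20% and highest 10%.
sens = np.arange(10.0).reshape(2, 5)
weights = np.ones((2, 5))
pruned, mask = prune_by_sensitivity(weights, sens, rho_min=20, rho_max=10)
print(int(mask.sum()))  # 7 weights kept
```

The same function applies unchanged to a one-dimensional vector of neuron sensitivities, in which case the mask would be multiplied into the pre-logit activations rather than the weight matrix.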

Paper Structure (23 sections, 12 equations, 9 figures, 9 tables)


Figures (9)

  • Figure 1: Illustration of parameter sensitivity distribution for (a) ResNet50 and (b) ViT-B/16. The parameters are selected from the last fully-connected layer for both ResNet50 and ViT-B/16. The dotted line in red indicates the average sensitivity and the maximum sensitivity is normalized to 1.
  • Figure 2: (a) Illustration of the last fully-connected layer before and after OPNP; the connections and neurons in grey represent the pruned ones. (b) The first row illustrates the parameter sensitivity before and after pruning, and the second row illustrates the neuron sensitivity before and after pruning.
  • Figure 3: Effect of varying the pruning percentage parameters in the ResNet50 model. (a) Effect of varying $\rho_{max}^w$; (b) Effect of varying $\rho_{min}^w$ with $\rho_{max}^w=0.5$; (c) Effect of varying $\rho_{max}^o$; (d) Effect of varying $\rho_{min}^o$ with $\rho_{max}^o=30$. All numbers are percentages.
  • Figure 4: Illustration of confidence reliability diagrams. (a) Sample distribution histogram in different confidence bins. (b) Confidence reliability diagrams (CRD) in the original calibrated model. (c) CRD in the model with optimal parameter and neuron pruning.
  • Figure 5: Illustration of OOD score distributions in two tasks with the Energy baseline and our proposed OPNP. (a) Energy baseline on the SUN benchmark. (b) OPNP on the SUN benchmark. (c) Energy baseline on the Places benchmark. (d) OPNP on the Places benchmark.
  • ...and 4 more figures