Optimal Parameter and Neuron Pruning for Out-of-Distribution Detection
Chao Chen, Zhihang Fu, Kai Liu, Ze Chen, Mingyuan Tao, Jieping Ye
TL;DR
This work tackles the challenge of detecting out-of-distribution (OOD) samples without costly retraining by introducing Optimal Parameter and Neuron Pruning (OPNP), a training-free method that prunes parameters and pre-logit neurons based on gradient-derived sensitivity to the energy score. By computing $M_{ij}=\frac{1}{m}\sum_{k=1}^{m}|g_{ij}(\bm{x}_k)|$ and removing weights and neurons with extreme sensitivities, OPNP reduces overfitting and sharp decision boundaries, thereby improving ID–OOD separability. Across ImageNet-1K and CIFAR benchmarks with ResNet50 and ViT-B/16, OPNP, especially when pruning high-sensitivity components, consistently surpasses state-of-the-art post-hoc methods in FPR95 and AUROC and even enhances model calibration. The approach is compatible with other post-hoc methods and highlights the practical value of sensitivity-guided pruning for robust OOD detection in real-world deployments.
Abstract
For a machine learning model deployed in real-world scenarios, the ability to detect out-of-distribution (OOD) samples is both indispensable and challenging. Most existing OOD detection methods focus on advanced training techniques or training-free tricks that prevent the model from yielding overconfident scores for unknown samples. Training-based methods incur expensive training costs and rely on OOD samples, which are not always available, while most training-free methods cannot efficiently utilize the prior information in the training data. In this work, we propose an \textbf{O}ptimal \textbf{P}arameter and \textbf{N}euron \textbf{P}runing (\textbf{OPNP}) approach, which aims to identify and remove the parameters and neurons that lead to over-fitting. The method consists of two steps. In the first step, we evaluate the sensitivity of the model parameters and neurons by averaging gradients over all training samples. In the second step, the parameters and neurons with exceptionally large or close-to-zero sensitivities are removed for prediction. Our proposal is training-free, is compatible with other post-hoc methods, and exploits information from all training data. Extensive experiments on multiple OOD detection tasks and model architectures show that OPNP consistently outperforms existing methods by a large margin.
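The two steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a single linear classification layer acting on pre-logit features, uses the standard energy score $E(\bm{x}) = -\log\sum_c \exp(f_c(\bm{x}))$, and covers only weight pruning (neuron pruning is omitted). The function names and the percentile thresholds `low_pct` and `high_pct` are hypothetical choices made for illustration.

```python
import numpy as np


def energy(W, b, feats):
    """Energy score E = -logsumexp(W h + b) for each row h of feats."""
    logits = feats @ W.T + b
    mx = logits.max(axis=1)
    return -(mx + np.log(np.exp(logits - mx[:, None]).sum(axis=1)))


def energy_sensitivity(W, b, feats):
    """Step 1: mean absolute gradient of the energy w.r.t. each weight.

    For a linear layer, dE/dW_ij = -softmax_i * h_j, so
    |g_ij(x_k)| = softmax_i(x_k) * |h_j(x_k)|.
    """
    logits = feats @ W.T + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)             # shape (m, C)
    return probs.T @ np.abs(feats) / len(feats)           # shape (C, D)


def prune_by_sensitivity(W, M, low_pct=10.0, high_pct=99.0):
    """Step 2: zero out weights whose sensitivity is near zero or very large.

    The percentile cut-offs are illustrative assumptions, not tuned values.
    """
    lo, hi = np.percentile(M, [low_pct, high_pct])
    keep = (M >= lo) & (M <= hi)
    return W * keep, keep


# Toy data standing in for training-set features (assumed non-negative,
# as after a ReLU) and a randomly initialised classifier head.
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 8))
b = np.zeros(5)
feats = np.abs(rng.normal(size=(100, 8)))

M = energy_sensitivity(W, b, feats)
W_pruned, keep = prune_by_sensitivity(W, M)
scores = energy(W_pruned, b, feats)  # energy scores from the pruned head
```

In this sketch the pruned weights are used only at prediction time and the original weights are left untouched, which matches the training-free setting described above.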
