Leveraging Perturbation Robustness to Enhance Out-of-Distribution Detection
Wenxi Chen, Raymond A. Yeh, Shaoshuai Mou, Yan Gu
TL;DR
Out-of-distribution detection remains challenging in open-world settings, especially for near-OOD cases. The paper introduces Perturbation Rectified OOD detection (PRO), a simple post-hoc method that minimizes an OOD score under controlled input perturbations to exploit the greater perturbation sensitivity of OOD inputs. By leveraging an adversarial score $g^{*}(\boldsymbol{x})=\min_{\|\delta\|_{\infty}\le\epsilon} g(\boldsymbol{x}+\delta)$ and using robust, adversarially trained models, PRO improves separability between IND and OOD scores without changing model architectures. Extensive experiments on OpenOOD and RobustBench show PRO achieving leading performance for small-scale models, particularly in near-OOD settings, and a notable FPR@95 reduction of over 10% in CIFAR-10 with adversarial training. These results highlight a practical link between adversarial robustness and OOD detection, offering a scalable post-hoc enhancement for safer deployment of vision systems.
Abstract
Out-of-distribution (OOD) detection is the task of identifying inputs that deviate from the training data distribution. This capability is essential for safely deploying deep computer vision models in open-world environments. In this work, we propose a post-hoc method, Perturbation-Rectified OOD detection (PRO), based on the insight that prediction confidence for OOD inputs is more susceptible to reduction under perturbation than that for in-distribution (IND) inputs. Based on this observation, we propose an adversarial score function that searches for a local minimum of the score near the original input by applying gradient descent. This procedure enhances the separability between IND and OOD samples. Importantly, the approach improves OOD detection performance without complex modifications to the underlying model architectures. We conduct extensive experiments using the OpenOOD benchmark~\cite{yang2022openood}. Our approach further pushes the limit of softmax-based OOD detection and is the leading post-hoc method for small-scale models. On a CIFAR-10 model with adversarial training, PRO effectively detects near-OOD inputs, achieving a reduction of more than 10\% in FPR@95 compared to state-of-the-art methods.
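
The score minimization described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy linear softmax classifier in place of a deep network, takes the maximum softmax probability as the base score g, and uses signed gradient steps with projection onto the L-infinity ball as one possible way to carry out the gradient descent. All names (`msp_score_and_grad`, `pro_score`, `W_TOY`, `B_TOY`) and the values of `eps`, `step`, and `n_steps` are hypothetical choices made for illustration.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def msp_score_and_grad(x, W, b):
    """Max softmax probability g(x) of a linear model, and its gradient w.r.t. x.

    Assumption: a linear classifier stands in for the network so that the
    gradient can be written in closed form.
    """
    logits = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    p = softmax(logits)
    k = max(range(len(p)), key=lambda j: p[j])  # predicted class
    # d p_k / d z_j = p_k * (1[j == k] - p_j)
    dz = [p[k] * ((1.0 if j == k else 0.0) - p[j]) for j in range(len(p))]
    # Chain rule through z = W x + b
    grad = [sum(dz[j] * W[j][i] for j in range(len(W))) for i in range(len(x))]
    return p[k], grad

def sign(v):
    return (v > 0) - (v < 0)

def pro_score(x, W, b, eps=0.1, step=0.02, n_steps=10):
    """Approximate min over ||delta||_inf <= eps of g(x + delta).

    Assumption: signed gradient descent with clipping to [-eps, eps];
    the lowest score seen along the trajectory is returned.
    """
    best, _ = msp_score_and_grad(x, W, b)
    delta = [0.0] * len(x)
    for _ in range(n_steps):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        _, grad = msp_score_and_grad(x_adv, W, b)
        # Step against the gradient, then project back onto the eps-ball.
        delta = [max(-eps, min(eps, di - step * sign(gi)))
                 for di, gi in zip(delta, grad)]
        score, _ = msp_score_and_grad([xi + di for xi, di in zip(x, delta)], W, b)
        best = min(best, score)
    return best

# Hypothetical toy model: two classes, identity weights, zero bias.
W_TOY = [[1.0, 0.0], [0.0, 1.0]]
B_TOY = [0.0, 0.0]
x_confident = [3.0, 0.0]  # far from the decision boundary
x_uncertain = [0.3, 0.0]  # close to the decision boundary

for name, x in [("confident", x_confident), ("uncertain", x_uncertain)]:
    base, _ = msp_score_and_grad(x, W_TOY, B_TOY)
    print(name, round(base, 4), round(pro_score(x, W_TOY, B_TOY), 4))
```

In this toy setting the rectified score of the input near the decision boundary drops further than that of the confident input, which mirrors the kind of gap the method relies on. Whether the same holds for a real network and real OOD data is an empirical question that this sketch does not address.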
