Leveraging Perturbation Robustness to Enhance Out-of-Distribution Detection

Wenxi Chen, Raymond A. Yeh, Shaoshuai Mou, Yan Gu

TL;DR

Out-of-distribution detection remains challenging in open-world settings, especially for near-OOD cases. The paper introduces Perturbation-Rectified OOD detection (PRO), a simple post-hoc method that minimizes an OOD score under controlled input perturbations to exploit the greater perturbation sensitivity of OOD inputs. By leveraging an adversarial score $g^{*}(\boldsymbol{x})=\min_{\|\delta\|_{\infty}\le\epsilon} g(\boldsymbol{x}+\delta)$ and using robust, adversarially trained models, PRO improves separability between IND and OOD scores without changing model architectures. Extensive experiments on OpenOOD and RobustBench show PRO achieving leading performance for small-scale models, particularly in near-OOD settings, and a notable FPR@95 reduction of over 10% on CIFAR-10 with adversarial training. These results highlight a practical link between adversarial robustness and OOD detection, offering a scalable post-hoc enhancement for safer deployment of vision systems.

Abstract

Out-of-distribution (OOD) detection is the task of identifying inputs that deviate from the training data distribution. This capability is essential for safely deploying deep computer vision models in open-world environments. In this work, we propose a post-hoc method, Perturbation-Rectified OOD detection (PRO), based on the insight that prediction confidence for OOD inputs is more susceptible to reduction under perturbation than that for in-distribution (IND) inputs. Building on this observation, we propose an adversarial score function that searches for the local minimum score near the original input by applying gradient descent. This procedure enhances the separability between IND and OOD samples. Importantly, the approach improves OOD detection performance without complex modifications to the underlying model architecture. We conduct extensive experiments using the OpenOOD benchmark \cite{yang2022openood}. Our approach further pushes the limit of softmax-based OOD detection and is the leading post-hoc method for small-scale models. On a CIFAR-10 model with adversarial training, PRO effectively detects near-OOD inputs, achieving a reduction of more than 10% in FPR@95 compared to state-of-the-art methods.
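
The search for a local minimum score can be illustrated on a toy problem. The sketch below is not the authors' implementation. It assumes a hypothetical linear softmax classifier (`W`, `b`) so that the gradient of the MSP score can be written analytically, and it further assumes a signed-gradient update with step size `eps / steps`, a projection back onto the $\ell_\infty$ ball of radius `eps`, and keeping the lowest score seen along the way. The names `msp_and_grad` and `pro_msp` are illustrative only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def msp_and_grad(x, W, b):
    """Return the MSP score g(x) = max_k softmax(Wx + b)_k and dg/dx.

    The linear model is an assumption made so the gradient has a closed form:
    d p_k / d x = p_k * (W_k - sum_j p_j * W_j), with k the predicted class.
    """
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    p = softmax(logits)
    k = max(range(len(p)), key=p.__getitem__)
    mean_row = [sum(p[j] * W[j][i] for j in range(len(W))) for i in range(len(x))]
    grad = [p[k] * (W[k][i] - mean_row[i]) for i in range(len(x))]
    return p[k], grad


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def pro_msp(x, W, b, eps=0.1, steps=5):
    """Approximate min over ||delta||_inf <= eps of g(x + delta).

    Uses signed gradient descent on the input (an assumed update rule),
    projecting back into the eps-ball around x after every step.
    """
    step_size = eps / steps
    x_adv = list(x)
    best, _ = msp_and_grad(x_adv, W, b)
    for _ in range(steps):
        _, grad = msp_and_grad(x_adv, W, b)
        # Move each coordinate in the direction that lowers the score.
        x_adv = [xa - step_size * sign(g) for xa, g in zip(x_adv, grad)]
        # Project onto the l_inf ball of radius eps centred at the original x.
        x_adv = [min(max(xa, x0 - eps), x0 + eps) for xa, x0 in zip(x_adv, x)]
        score, _ = msp_and_grad(x_adv, W, b)
        best = min(best, score)
    return best
```

For instance, with `W = [[1.0, 0.0], [0.0, 1.0]]`, `b = [0.0, 0.0]`, and `x = [2.0, 0.0]`, calling `pro_msp(x, W, b, eps=0.5)` returns a score below the unperturbed MSP, while `eps=0.0` leaves the score unchanged. With a deep network, the gradient would typically come from automatic differentiation rather than a closed form.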

Paper Structure

This paper contains 18 sections, 13 equations, 13 figures, 6 tables, 1 algorithm.

Figures (13)

  • Figure 1: Near-OOD detection performance on a CIFAR-10 robust model \cite{diffenderfer2021winning}. Near-OOD data include CIFAR-100 \cite{krizhevsky2009cifar} and Tiny-ImageNet \cite{le2015tiny}. Markers distinguish the following baseline categories: feature-based methods, such as VIM \cite{wang2022vim} and KNN \cite{sun2022out} ($\lozenge$); energy \cite{liu2020energy} and activation modification methods, such as Scale \cite{xu2024scaling} ($\triangle$); gradient-based methods, such as ODIN \cite{liang2018enhancing} and GradNorm \cite{huang2021importance} ($\circ$); and softmax-based scores ($\square$). We apply PRO to MSP, Entropy \cite{hendrycks2016baseline}, Temperature Scaling \cite{guo2017calibration}, and GEN \cite{liu2023gen}, forming four PRO methods. Notably, the proposed PRO preprocessing significantly enhances the performance of softmax scores in distinguishing challenging near-OOD data.
  • Figure 2: Algorithm overview of the proposed Perturbation-Rectified OOD (PRO) detection. (a) We run multi-step projected gradient descent on the input image during inference to minimize the OOD detection score function. Since the score for OOD data is expected to shift more under perturbation than the score for IND data, this process enhances the separability between IND and OOD scores. (b) MSP score landscapes for two IND and two OOD samples, visualized by random projection \cite{li2018visualizing}; more examples are provided in Fig. \ref{figure:contoursplots}.
  • Figure 3: Distribution plots of the MSP score shift introduced by a one-step gradient-based perturbation. OOD data undergo larger score shifts than IND data. Results are obtained on a CIFAR-10 model with adversarial training \cite{diffenderfer2021winning}.
  • Figure 4: AUROC performance on CIFAR-10 across the baseline model \cite{yang2022openood} and adversarially robust models (i.e., AugMix_ResNeXt, Binary, LRR-CARD-Deck, and LRR) \cite{diffenderfer2021winning, hendrycks*2020augmix}. PRO consistently enhances four representative softmax scores: MSP, entropy \cite{hendrycks2016baseline}, temperature-scaled MSP (MSP-T) \cite{guo2017calibration}, and GEN \cite{liu2023gen}.
  • Figure 5: AUROC performance of PRO methods on ImageNet. PRO works best with the data augmentation methods PixMix \cite{hendrycks2022pixmix} and AugMix \cite{hendrycks*2020augmix}, while the other two robust models, NoisyMix \cite{erichson2024noisymix} and SIN_IN \cite{geirhos2018imagenet}, have a negative impact on OOD detection. MSP, temperature scaling, and Entropy can still benefit from PRO to enhance near-OOD detection.
  • ...and 8 more figures

Theorems & Definitions (2)

  • Claim 1
  • proof