What If the Input is Expanded in OOD Detection?

Boxuan Zhang, Jianing Zhu, Zengmao Wang, Tongliang Liu, Bo Du, Bo Han

TL;DR

A new scoring method, namely, Confidence aVerage (CoVer), is formalized, which can capture the dynamic differences by simply averaging the scores obtained from different corrupted inputs and the original ones, making the OOD and ID distributions more separable in detection tasks.

Abstract

Out-of-distribution (OOD) detection aims to identify OOD inputs from unknown classes, which is important for the reliable deployment of machine learning models in the open world. Various scoring functions have been proposed to distinguish such inputs from in-distribution (ID) data. However, existing methods generally focus on extracting discriminative information from a single input, which implicitly limits the representation dimension. In this work, we introduce a novel perspective, i.e., applying different common corruptions in the input space, to expand it. We reveal an interesting phenomenon termed confidence mutation, where the confidence of OOD data can decrease significantly under the corruptions, while the ID data shows a higher confidence expectation owing to the resistance of semantic features. Based on that, we formalize a new scoring method, namely, Confidence aVerage (CoVer), which can capture the dynamic differences by simply averaging the scores obtained from different corrupted inputs and the original ones, making the OOD and ID distributions more separable in detection tasks. Extensive experiments and analyses have been conducted to understand and verify the effectiveness of CoVer. The code is publicly available at: https://github.com/tmlr-group/CoVer.
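
The averaging step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-input score is taken to be the maximum softmax probability, and `toy_model`, `scale`, and the input values are hypothetical stand-ins for a real classifier and real image corruptions. See the linked repository for the actual code.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def msp_score(logits):
    """Maximum softmax probability (assumed per-input confidence score)."""
    return max(softmax(logits))

def cover_score(model, x, corruptions, score_fn=msp_score):
    """Average the score of the original input with the scores of its corrupted variants."""
    inputs = [x] + [corrupt(x) for corrupt in corruptions]
    scores = [score_fn(model(v)) for v in inputs]
    return sum(scores) / len(scores)

def toy_model(x):
    """Hypothetical 3-class linear classifier returning logits."""
    weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def scale(factor):
    """Hypothetical corruption: shrink the input (a stand-in for e.g. a contrast change)."""
    return lambda x: [factor * xi for xi in x]

corruptions = [scale(0.5), scale(0.25)]
x = [4.0, 0.0]

single = msp_score(toy_model(x))                   # score from the original input only
averaged = cover_score(toy_model, x, corruptions)  # score averaged over original + corrupted
print(f"single-input score: {single:.4f}, averaged score: {averaged:.4f}")
```

Passing an empty list of corruptions reduces `cover_score` to the single-input score, so the same function covers both cases being compared.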

Paper Structure

This paper contains 71 sections, 1 theorem, 22 equations, 12 figures, 21 tables.

Key Result

Lemma D.3

Assuming the variation relationships between $\mu_{p i}$ and $\mu_{p o}$, and between $\sigma_{p i}$ and $\sigma_{p o}$, CoVer enables a lower $\mathrm{FPR}_{\lambda}$.
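
The lemma is stated in terms of $\mathrm{FPR}_{\lambda}$, which this summary does not define. Assuming it follows the usual convention (the fraction of OOD samples accepted as ID when the threshold is set to retain a fraction $\lambda$ of ID samples, as in FPR95), the metric could be computed as in the sketch below. The function name and the toy scores are illustrative only and are not taken from the paper.

```python
import math

def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    """False positive rate on OOD data at the threshold that retains `tpr` of ID data.

    Higher scores are assumed to mean "more likely ID".
    """
    ranked = sorted(id_scores, reverse=True)
    k = max(1, math.ceil(tpr * len(ranked)))  # number of ID samples to retain
    threshold = ranked[k - 1]                 # lowest retained ID score
    false_positives = sum(1 for s in ood_scores if s >= threshold)
    return false_positives / len(ood_scores)

# Toy scores, chosen only to exercise the function.
id_scores = [0.9, 0.8, 0.7, 0.6]
ood_scores = [0.85, 0.75, 0.3, 0.1]
print(fpr_at_tpr(id_scores, ood_scores, tpr=0.75))
```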

Figures (12)

  • Figure 1: Comparison of score distributions and detection results with different inputs for representation dimension expansion. Left panel: results with a single original input; Middle panel: results with a single corrupted input, which perform worse but have mutated scores for some OOD samples (see Figure \ref{fig:explanation1}); Right panel: results with multiple inputs (CoVer), which achieve variance reduction for the ID distribution and better ID-OOD separability (see Figure \ref{fig:explanation2} for more explanations).
  • Figure 2: Detailed explanation of the discovery illustrated in Figure \ref{fig:main_framework}. The ID and OOD data here are divided into four groups, i.e., Confident ID, Unconfident ID, Overconfident OOD, and Unconfident OOD. First Row: the variation of confidence scores for ID and OOD data before and after being corrupted. The critical difference lies in the greater decline in confidence for overconfident OOD data compared to unconfident ID data (see Figure \ref{fig:explanation2} for further discussion). Second Row: scatter plots of confidence scores sampled from the four groups under the same corruption, statistically supporting the findings of the first row. See Appendix \ref{app:ablation} for more details.
  • Figure 3: Visual exploration of random unconfident ID samples and the confidence mutation exemplified on random overconfident OOD samples under the same corruption. For each original input and its corrupted variant, we leverage the Fast Fourier Transform to extract their low-frequency and high-frequency parts. Left panel: visual investigation of unconfident ID samples with ID semantic features at low-frequency levels that are resistant to corruptions. Right panel: an intuitive comparison of overconfident OOD samples, whose confidences show significant changes due to the elimination of non-semantic features at the high-frequency level. See Appendix \ref{app:visualization:2} for more detailed analyses.
  • Figure 4: Overview of CoVer. Left panel: visualization of the raw input and inputs w.r.t. different corruptions; Left-middle panel: procedures of logit outputs from single-modal and multi-modal networks; Right-middle panel: scoring functions that equip each dimensional output with an OOD score; Right panel: realization of CoVer by averaging OOD scores obtained from multiple dimensions.
  • Figure 5: Ablation study. (a) superiority of the multi-dimensional scoring framework; (b) exploration of different numbers of expanded input dimensions; (c) using different severity levels of a specific corruption type; (d) comparison of different realizations for each dimensional confidence score.
  • ...and 7 more figures
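
The Figure 3 caption mentions using the Fast Fourier Transform to separate each image into low-frequency and high-frequency parts. One common way to do this is sketched below with NumPy. The circular mask, the `radius` parameter, and the random test image are assumptions made for illustration; the caption does not specify how the paper performs the split.

```python
import numpy as np

def split_frequencies(image, radius):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    Frequencies within `radius` of the spectrum centre are treated as low,
    the rest as high (an assumed circular cut-off).
    """
    h, w = image.shape
    # Move the zero-frequency component to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    low_mask = dist <= radius
    # Invert each masked spectrum back to the spatial domain.
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high

rng = np.random.default_rng(0)
image = rng.random((8, 8))  # stand-in for a grayscale image
low, high = split_frequencies(image, radius=2)
print(np.allclose(low + high, image))
```

Because the two masks are complementary, the low- and high-frequency parts sum back to the original image, which gives a quick sanity check on the decomposition.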

Theorems & Definitions (2)

  • Definition 3.1: Confidence Difference
  • Lemma D.3: Declination of $\mathrm{FPR}_{\lambda}$