Rethinking Out-of-Distribution Detection on Imbalanced Data Distribution

Kai Liu, Zhihang Fu, Sheng Jin, Chao Chen, Ze Chen, Rongxin Jiang, Fan Zhou, Yaowu Chen, Jieping Ye

TL;DR

A unified training-time regularization technique mitigates the class-aware bias of imbalanced OOD detectors across architecture designs, yielding consistent improvements on the representative CIFAR10-LT, CIFAR100-LT, and ImageNet-LT benchmarks over several state-of-the-art OOD detection approaches.

Abstract

Detecting and rejecting unknown out-of-distribution (OOD) samples is critical for deployed neural networks to avoid unreliable predictions. In real-world scenarios, however, the efficacy of existing OOD detection methods is often impeded by the inherent imbalance of in-distribution (ID) data, which causes a significant performance decline. Through statistical observations, we identify two common failure modes shared by different OOD detectors: tail-class ID samples are misidentified as OOD, while OOD samples are erroneously predicted as head-class ID samples. To explain this phenomenon, we introduce a generalized statistical framework, termed ImOOD, to formulate the OOD detection problem on imbalanced data distributions. The theoretical analysis reveals a class-aware bias term between balanced and imbalanced OOD detection, which contributes to the performance gap. Building upon this finding, we present a unified training-time regularization technique to mitigate the bias and boost imbalanced OOD detectors across architecture designs. Our theoretically grounded method translates into consistent improvements on the representative CIFAR10-LT, CIFAR100-LT, and ImageNet-LT benchmarks against several state-of-the-art OOD detection approaches. Code is available at https://github.com/alibaba/imood.

Paper Structure (25 sections, 2 theorems, 14 equations, 5 figures, 8 tables)


Key Result

Lemma 3.1

For each ID class $y$ in open-set, there exists a non-negative variable $\gamma_y({\bm{x}})$, so that $P^{\text{bal}}(y|{\bm{x}}) = \gamma_y({\bm{x}}) \cdot \frac{P(y|{\bm{x}})}{P(y)}$, where $\gamma_y({\bm{x}}) = \frac{1}{K} \frac{P^{\text{bal}}({\bm{x}}|y)}{P({\bm{x}}|y)} \in (0, \infty)$, $P(y|{\

Figures (5)

  • Figure 1: Issues of OOD detection on imbalanced data. (a) Statistics of the class labels of ID samples that are wrongly detected as OOD, and the class predictions of OOD samples that are wrongly detected as ID. (b) Illustration of the OOD detection process in feature space. Head classes' huge decision space and tail classes' small decision space jointly damage the OOD detection.
  • Figure 2: Statistics on $\gamma$, $\beta$, and $\Delta$ from the CIFAR10-LT benchmark. (1) Upper: distributions on ID samples from head to tail (left to right) class indices; (2) Lower: distributions on OOD samples predicted as head to tail (left to right) ID classes.
  • Figure A1: Class-aware error statistics for OOD detection on different benchmarks.
  • Figure A2: Class-aware error statistics for different OOD detectors on CIFAR10-LT.
  • Figure A3: Statistics on correctly-detected ID and OOD samples.
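
The class-aware error statistics described for Figure 1(a) can be tallied along the following lines. This is a minimal sketch, not the authors' implementation: the function name `class_aware_errors`, its arguments, and the toy data are all hypothetical, and class index 0 is assumed to be the head class and the largest index the tail class.

```python
from collections import Counter


def class_aware_errors(id_labels, id_flagged_ood, ood_pred_classes,
                       ood_flagged_ood, num_classes):
    """Count OOD-detection errors per class (hypothetical helper).

    id_labels        : true class index of each ID sample
    id_flagged_ood   : True where the detector rejected that ID sample as OOD
    ood_pred_classes : ID class the classifier predicted for each OOD sample
    ood_flagged_ood  : True where the detector rejected that OOD sample as OOD
    """
    # ID samples wrongly detected as OOD, grouped by their true class label.
    id_as_ood = Counter(
        y for y, flagged in zip(id_labels, id_flagged_ood) if flagged
    )
    # OOD samples wrongly detected as ID, grouped by the predicted class.
    ood_as_id = Counter(
        c for c, flagged in zip(ood_pred_classes, ood_flagged_ood) if not flagged
    )
    return (
        [id_as_ood[k] for k in range(num_classes)],
        [ood_as_id[k] for k in range(num_classes)],
    )


# Toy data, invented for illustration only (3 classes, 0 = head, 2 = tail).
id_labels = [0, 0, 0, 1, 2, 2]
id_flagged_ood = [False, False, False, False, True, True]
ood_pred_classes = [0, 0, 1, 2]
ood_flagged_ood = [False, False, True, True]

id_as_ood, ood_as_id = class_aware_errors(
    id_labels, id_flagged_ood, ood_pred_classes, ood_flagged_ood, num_classes=3
)
print("ID wrongly detected as OOD, per true class:", id_as_ood)
print("OOD wrongly detected as ID, per predicted class:", ood_as_id)
```

In this invented example the rejected ID samples all come from the tail class and the accepted OOD samples are all predicted as the head class, which is the pattern the abstract describes; real counts depend on the detector and benchmark.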

Theorems & Definitions (3)

  • Lemma 3.1
  • Theorem 3.2
  • proof