Out-Of-Distribution Detection with Diversification (Provably)

Haiyun Yao, Zongbo Han, Huazhu Fu, Xi Peng, Qinghua Hu, Changqing Zhang

TL;DR

This work proposes a simple yet practical approach with a theoretical guarantee, termed Diversity-induced Mixup for OOD detection (diverseMix), which efficiently enhances the diversity of the auxiliary outlier set used for training and achieves superior performance on both commonly used and recent, challenging large-scale benchmarks.

Abstract

Out-of-distribution (OOD) detection is crucial for ensuring reliable deployment of machine learning models. Recent advances focus on using easily accessible auxiliary outliers (e.g., data from the web or other datasets) in training. However, we show experimentally that these methods still struggle to generalize their detection capabilities to unknown OOD data, because the collected auxiliary outliers have limited diversity. We therefore examine this problem from the generalization perspective and demonstrate that a more diverse set of auxiliary outliers is essential for enhancing detection capabilities. In practice, however, collecting sufficiently diverse auxiliary outlier data is difficult and costly. We thus propose a simple yet practical approach with a theoretical guarantee, termed Diversity-induced Mixup for OOD detection (diverseMix), which efficiently enhances the diversity of the auxiliary outlier set used for training. Extensive experiments show that diverseMix achieves superior performance on both commonly used and recent, challenging large-scale benchmarks, which further confirms the importance of the diversity of auxiliary outliers.
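
To make the idea of mixing auxiliary outliers concrete, here is a minimal sketch of plain mixup applied within a batch of outliers. It is not the authors' diverseMix procedure, whose details are not given in this summary: the function name `mix_outliers`, the Beta parameter `alpha`, and the uniform random pairing are all assumptions made for illustration.

```python
import random

def mix_outliers(outliers, alpha=1.0, seed=0):
    """Plain mixup over a batch of auxiliary outliers.

    Assumption: each outlier is a list of floats, and every outlier is
    paired with a randomly chosen partner from the same batch. This is
    standard mixup, not the paper's own mixing strategy.
    """
    rng = random.Random(seed)
    partners = outliers[:]
    rng.shuffle(partners)  # random pairing within the batch
    mixed = []
    for x, x_partner in zip(outliers, partners):
        # Interpolation weight drawn from Beta(alpha, alpha), as in mixup.
        lam = rng.betavariate(alpha, alpha)
        mixed.append([lam * a + (1.0 - lam) * b for a, b in zip(x, x_partner)])
    return mixed

# Hypothetical 2-D outliers; each mixed point is a convex combination of two of them.
batch = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
mixed_batch = mix_outliers(batch)
```

Because each mixed sample is a convex combination of two existing outliers, it lies between them in input space, which is one simple way new outlier samples can be produced without collecting more data.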

Paper Structure

This paper contains 32 sections, 4 theorems, 46 equations, 2 figures, 10 tables, 1 algorithm.

Key Result

Theorem 1

(Generalization Bound of OOD Detector). Let $\mathcal{D}_{train} = \mathcal{D}_{id}\cup \mathcal{D}_{aux}$ consist of $M$ samples. For any hypothesis $h \in \mathcal{H}$ and $0 < \delta < 1$, with probability at least $1 - \delta$, the following inequality holds:

Figures (2)

  • Figure 1: OOD score for different training strategies. The ID data $\mathcal{X}_{in}\subset \mathbb{R}^2$ is sampled from three distinct Gaussian distributions, each representing a different class. The auxiliary outliers are sampled from a Gaussian mixture model located away from the ID data, where the number of mixture components indicates the number of classes contained in the auxiliary outliers. (a) The model trained without auxiliary outliers fails to detect OOD data. (b) Incorporating a less diverse set of auxiliary outliers (10 classes) during training enables partial OOD detection, but the model overfits the auxiliary outliers. (c) OOD detection improves with a more diverse set of auxiliary outliers (1000 classes). (d) diverseMix enriches the diversity of the outliers (10 classes) by creating significantly distinct mixed outliers.
  • Figure 2: Comparison of OOD detection performance on CIFAR-100 as the quality of the auxiliary outlier dataset decreases. (a) With the diversity of the auxiliary outliers held constant (1000 categories), the dataset size is decreased; the x-axis shows the percentage of the original outlier dataset used for training. (b) With the dataset size fixed (10% of the auxiliary outliers), the diversity of the outliers is decreased; the x-axis shows the number of categories. See the appendix for more details.
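
The toy setting described in the Figure 1 caption can be approximated with a short sampling script. This is only a sketch: the caption specifies three Gaussian ID classes in $\mathbb{R}^2$ and a Gaussian mixture of outliers away from the ID data, but the component centers, radii, standard deviation, and sample counts below are assumptions, as are the helper names `ring_centers` and `sample_gaussian_mixture`.

```python
import math
import random

def ring_centers(k, radius):
    """Place k component centers evenly on a circle (an assumed layout)."""
    return [
        (radius * math.cos(2 * math.pi * i / k),
         radius * math.sin(2 * math.pi * i / k))
        for i in range(k)
    ]

def sample_gaussian_mixture(centers, n_per_center, std, rng):
    """Draw n_per_center isotropic Gaussian points around each center."""
    points, labels = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_center):
            points.append((rng.gauss(cx, std), rng.gauss(cy, std)))
            labels.append(label)
    return points, labels

rng = random.Random(0)

# ID data: three Gaussian classes near the origin (radius 2.0 is assumed).
id_points, id_labels = sample_gaussian_mixture(ring_centers(3, 2.0), 100, 0.3, rng)

# Auxiliary outliers: a 10-component mixture farther out (radius 8.0 is assumed).
# The number of components plays the role of the number of outlier classes.
aux_points, aux_labels = sample_gaussian_mixture(ring_centers(10, 8.0), 20, 0.3, rng)
```

Changing the number of outlier components (for example from 10 to a much larger value) is how the diversity of the auxiliary set would be varied in such a toy experiment.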

Theorems & Definitions (5)

  • Theorem 1
  • Definition 1
  • Theorem 2
  • Lemma 1
  • Theorem 3