DOS: Diverse Outlier Sampling for Out-of-Distribution Detection

Wenyu Jiang, Hao Cheng, Mingcai Chen, Chongjun Wang, Hongxin Wei

TL;DR

This paper tackles overconfident predictions on out-of-distribution (OOD) inputs by introducing Diverse Outlier Sampling (DOS), a clustering-based strategy that selects diverse and informative outliers from an auxiliary OOD dataset and trains with an absent-category loss to shape a globally compact boundary between in-distribution (ID) and OOD data. DOS clusters the normalized features of candidate outliers, picks the most informative exemplar from each cluster, and uses a mini-batch scheme to stay efficient; its loss combines ID supervision with outlier regularization. Empirically, DOS achieves state-of-the-art OOD detection on common benchmarks and remains robust on large-scale datasets, with different auxiliary OOD pools, and when combined with energy-based losses or large pre-trained models such as CLIP. The method is simple to implement and scalable, and it yields substantial improvements in FPR95 and strong generalization across settings, which supports safer open-world deployment.

Abstract

Modern neural networks are known to give overconfident predictions for out-of-distribution inputs when deployed in the open world. It is common practice to leverage a surrogate outlier dataset to regularize the model during training, and recent studies emphasize the role of uncertainty in designing the sampling strategy for the outlier dataset. However, OOD samples selected solely on the basis of predictive uncertainty can be biased towards certain types and may fail to capture the full outlier distribution. In this work, we empirically show that diversity is critical in sampling outliers for OOD detection performance. Motivated by this observation, we propose a straightforward and novel sampling strategy named DOS (Diverse Outlier Sampling) to select diverse and informative outliers. Specifically, we cluster the normalized features at each iteration, and the most informative outlier from each cluster is selected for model training with an absent-category loss. With DOS, the sampled outliers efficiently shape a globally compact decision boundary between ID and OOD data. Extensive experiments demonstrate the superiority of DOS, reducing the average FPR95 by up to 25.79% on CIFAR-100 with TI-300K.
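
The selection step described in the abstract (cluster the normalized features, then keep one outlier per cluster) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes plain k-means for the clustering and assumes that "most informative" means the candidate with the lowest absent-class, i.e. ($K$+1)-th class, softmax probability. The names `kmeans`, `select_diverse_outliers`, `features`, and `absent_prob` are hypothetical, and the toy data is random.

```python
import math
import random


def l2_normalize(v):
    """Scale a vector to unit Euclidean norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (an assumption); returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def select_diverse_outliers(features, absent_prob, k, seed=0):
    """Return one index per non-empty cluster: the member with the lowest
    absent-class probability (assumed here to mean "most informative")."""
    assign = kmeans([l2_normalize(f) for f in features], k, seed=seed)
    picked = []
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        if members:
            picked.append(min(members, key=lambda i: absent_prob[i]))
    return picked


# Toy candidate pool: random features and random absent-class probabilities.
rng = random.Random(0)
features = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
absent_prob = [rng.random() for _ in range(200)]
picked = select_diverse_outliers(features, absent_prob, k=8)
```

Here `picked` holds at most `k` indices into the candidate pool, one per non-empty cluster. In a training loop this selection would be repeated at each iteration on features from the current model, which the sketch leaves out.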

Paper Structure

This paper contains 41 sections, 6 equations, 6 figures, 6 tables, and 1 algorithm.

Figures (6)

  • Figure 1: A toy example in 2D space illustrating different sampling strategies. The ID data consists of three class-conditional Gaussian distributions, and the OOD training samples are simulated with many small-scale class-conditional Gaussian distributions located away from the ID data. (a): All outliers sampled: a globally compact boundary, but intractable. (b): Random outliers sampled: efficient, with a loose boundary. (c): Uncertain outliers sampled: efficient, with a locally compact boundary (see the motivation subsection for more empirical results). (d): Diverse and uncertain outliers sampled: efficient, with a globally compact boundary.
  • Figure 2: Comparison of different sampling strategies. (a): The distribution of outliers (TI-300K) across six cluster centers under the greedy and uniform strategies. (b): The score distribution for ID (CIFAR-100) and OOD (All) data using the biased and uniform strategies. Compared with uniform sampling, biased sampling leaves more OOD examples with high scores that are close to those of ID data.
  • Figure 3: Comparison of the outliers selected by greedy sampling and by our proposed method in terms of (a) diversity and (b) uncertainty. For diversity, we adopt the label-independent clustering evaluation metric known as the Calinski-Harabasz index (Calinski and Harabasz, 1974), which is the ratio of inter-cluster dispersion to intra-cluster dispersion, each summed over all clusters. For uncertainty, we use the softmax probability of the ($K$+1)-th class.
  • Figure 4: Results of different features.
  • Figure 5: AUROC at varying epochs.
  • ...and 1 more figure
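
The Calinski-Harabasz index used as the diversity measure in Figure 3 can be computed directly from its standard definition: between-cluster dispersion divided by $(k-1)$, over within-cluster dispersion divided by $(n-k)$, for $n$ points in $k$ clusters. The sketch below is a dependency-free illustration under that definition, not the paper's evaluation code. The function name `calinski_harabasz` and the four toy points are hypothetical.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index: (between / (k - 1)) / (within / (n - k))."""
    n = len(points)
    clusters = sorted(set(labels))
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Mean of all points, used as the reference for between-cluster dispersion.
    overall = [sum(col) / n for col in zip(*points)]
    between = 0.0
    within = 0.0
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        center = [sum(col) / len(members) for col in zip(*members)]
        # Between: cluster size times squared distance of its center to the overall mean.
        between += len(members) * sq_dist(center, overall)
        # Within: squared distances of members to their own cluster center.
        within += sum(sq_dist(p, center) for p in members)
    if within == 0:
        return float("inf")  # every point coincides with its cluster center
    return (between / (k - 1)) / (within / (n - k))


# Two tight, well-separated pairs of points versus a labeling that mixes them.
toy_points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
separated = calinski_harabasz(toy_points, [0, 0, 1, 1])
mixed = calinski_harabasz(toy_points, [0, 1, 0, 1])
```

A higher value indicates clusters that are compact and far apart, so `separated` comes out much larger than `mixed` on this toy data.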