Prototype-based Optimal Transport for Out-of-Distribution Detection

Ao Ke, Wenlong Chen, Chuanwen Feng, Yukun Cao, Xike Xie, S. Kevin Zhou, Lei Feng

TL;DR

The paper addresses the challenge of detecting out-of-distribution inputs in deep networks by measuring distribution discrepancy between test samples and in-distribution (ID) prototypes. It introduces Prototype-based Optimal Transport (POT), which uses OT between test representations and class prototypes to obtain per-sample transport costs, and augments this with linear-extrapolated virtual outliers to boost detection of near-ID OOD samples. The final OOD score is a contrast between the transport costs to ID prototypes and to the virtual outliers, computed efficiently via entropic regularization and the Sinkhorn-Knopp algorithm. Empirically, POT achieves state-of-the-art performance on CIFAR-100 and ImageNet-1k across Far-OOD and Near-OOD benchmarks, is compatible with various training-time schemes, and remains effective even when training data are unavailable, highlighting its practical impact for reliable deployment.

Abstract

Detecting Out-of-Distribution (OOD) inputs is crucial for improving the reliability of deep neural networks in real-world deployment. In this paper, inspired by the inherent distribution shift between ID and OOD data, we propose a novel method that leverages optimal transport to measure the distribution discrepancy between test inputs and ID prototypes. The resulting transport costs quantify the individual contribution of each test input to the overall discrepancy, serving as a desirable measure for OOD detection. Because relying solely on the transport costs to ID prototypes is inadequate for identifying OOD inputs that lie close to ID data, we generate virtual outliers via linear extrapolation to approximate the OOD region. Combining the transport costs to ID prototypes with the costs to virtual outliers emphasizes the detection of OOD data near ID data, thereby enhancing the distinction between ID and OOD inputs. Experiments demonstrate the superiority of our method over state-of-the-art methods.
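
The per-sample transport cost described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the cosine-distance cost, the uniform marginals, the default `reg` value, and the function names `sinkhorn_plan` and `transport_cost_scores` are all assumptions made for illustration. Only the overall recipe (entropic OT between test representations and class prototypes solved with Sinkhorn-Knopp, then a row sum per test input) follows the text.

```python
import numpy as np

def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Entropic-regularized OT plan via Sinkhorn-Knopp iterations.

    Assumes uniform mass on both sides (an illustrative choice).
    """
    n, k = cost.shape
    a = np.full(n, 1.0 / n)          # mass on each test sample
    b = np.full(k, 1.0 / k)          # mass on each prototype
    kernel = np.exp(-cost / reg)     # Gibbs kernel
    v = np.ones(k)
    for _ in range(n_iters):
        u = a / (kernel @ v)         # rescale rows toward marginal a
        v = b / (kernel.T @ u)       # rescale columns toward marginal b
    return u[:, None] * kernel * v[None, :]

def transport_cost_scores(features, prototypes, reg=0.1):
    """Row sums of (plan * cost): one transport cost per test input."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    cost = 1.0 - f @ p.T             # cosine distance (assumed cost)
    plan = sinkhorn_plan(cost, reg)
    return (plan * cost).sum(axis=1)
```

Under these assumptions, a higher score means the input is more expensive to move onto the ID prototypes, so it is more likely to be OOD. Because the plan is solved jointly over a batch, the scores depend on the composition of the test batch as well as on the individual input.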

Paper Structure

This paper contains 14 sections, 14 equations, 4 figures, 8 tables.

Figures (4)

  • Figure 1: Illustration of our method for OOD detection. In (a), the representation distribution of OOD inputs is distinctly separated from ID inputs, visualized via t-SNE. The model is ResNet-18. The ID and OOD datasets are CIFAR-10 and SVHN, respectively. (b) shows a slice of the transport cost matrix, which is derived from the optimal transport between test inputs and ID prototypes (depicted as triangles). The row sum of a test input (labelled from A to L) represents the transport cost from it to all ID prototypes. Darker colors indicate higher transport costs. It is evident that the ID inputs (depicted as orange circles) generally incur lower transport costs compared to the OOD inputs (depicted as blue circles).
  • Figure 2: Illustration of virtual outlier generation. The average representation of test inputs $\mathcal{M}$ lies between the average of ID inputs $\mathcal{M}_\text{in}$ and the average of OOD inputs $\mathcal{M}_\text{out}$. We generate virtual outliers to approximate the OOD region using linear extrapolation between $\mathcal{M}$ and ID prototypes.
  • Figure 3: Ablation study on the effect of virtual outliers. We contrast the distribution of the transport cost score without virtual outliers (a & c) with that of the contrastive transport cost score with virtual outliers (b & d). The models used are ResNet-18 for CIFAR-100 and ViT-b16 for ImageNet-1k. Introducing virtual outliers makes the score more distinguishable, leading to enhanced OOD detection performance.
  • Figure 4: Ablation across different parameters in POT: (a) test batch size; (b) entropic regularization coefficient $\lambda$ of OT; (c) linear extrapolation parameter $\omega$ for generating virtual outliers.
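
The virtual outlier generation illustrated in Figure 2 can be sketched as follows. The source states only that outliers come from linear extrapolation between the test mean $\mathcal{M}$ and the ID prototypes, controlled by $\omega$. The specific form `p_k + omega * (M - p_k)`, the default `omega=2.0`, and the function name `virtual_outliers` are assumptions for illustration, not the paper's confirmed formula.

```python
import numpy as np

def virtual_outliers(prototypes, test_features, omega=2.0):
    """Extrapolate from each ID prototype through the test mean M.

    Hypothetical form: outlier_k = p_k + omega * (M - p_k).
    omega = 1 lands exactly on M; omega > 1 moves past M, away from p_k.
    """
    test_mean = test_features.mean(axis=0)   # M in Figure 2
    return prototypes + omega * (test_mean - prototypes)
```

The intuition, following Figure 2, is that $\mathcal{M}$ sits between the ID mean and the OOD mean, so stepping from a prototype past $\mathcal{M}$ moves toward the OOD region. The resulting points can then serve as a second prototype set: one plausible contrast, again an assumption, is a per-sample transport cost to the ID prototypes minus the cost to these virtual outliers.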