Detecting Out-of-Distribution Samples via Conditional Distribution Entropy with Optimal Transport

Chuanwen Feng, Wenlong Chen, Ao Ke, Yilong Ren, Xike Xie, S. Kevin Zhou

TL;DR

This work tackles out-of-distribution detection when test inputs are available by modeling training and test data as empirical distributions and measuring their geometric discrepancy through discrete entropic optimal transport. The authors introduce a conditional distribution entropy score derived from the OT transport plan, enabling a principled, parameterized measure of uncertainty that distinguishes ID from OOD samples. The method integrates supervised or self-supervised contrastive training to obtain compact, discriminative features and demonstrates state-of-the-art performance across benchmarks such as CIFAR-100 vs CIFAR-10 and in large semantic spaces, with efficient Sinkhorn-based computation. By combining pair- and population-wise information without distributional assumptions, the approach offers a practical, training-agnostic framework for robust OOD detection in open-world and continual-learning contexts.

Abstract

When a trained machine learning model is deployed in the real world, it inevitably receives inputs from out-of-distribution (OOD) sources. For instance, in continual learning settings, OOD samples are common because the domain is non-stationary. More generally, when a set of test inputs is available, the existing rich line of OOD detection solutions, including the recently promising distance-based methods, falls short of effectively using the distribution information in training samples and test inputs. In this paper, we argue that empirical probability distributions that incorporate geometric information from both training samples and test inputs can be highly beneficial for OOD detection when test inputs are available. To this end, we propose to model OOD detection as a discrete optimal transport problem. Within the framework of optimal transport, we propose a novel score function, the conditional distribution entropy, to quantify the uncertainty of a test input being an OOD sample. Our proposal inherits the merits of certain distance-based methods while eliminating the reliance on distribution assumptions, a priori knowledge, and specific training mechanisms. Extensive experiments on benchmark datasets demonstrate that our method outperforms its competitors in OOD detection.

Paper Structure

This paper contains 21 sections, 4 theorems, 26 equations, 4 figures, 6 tables, 2 algorithms.

Key Result

Theorem 3.2

The problem $\mathcal{L}_{\lambda}(\mu,\nu,\mathbf{C})$ has a unique optimal solution.
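This summary does not spell out the objective behind $\mathcal{L}_{\lambda}(\mu,\nu,\mathbf{C})$. As a hedged reading aid, one common form of the entropy-regularized discrete optimal transport problem is shown below. The feasible set $U(\mu,\nu)$, the entropy $H$, and the exact role of $\lambda$ are assumptions taken from the standard formulation and may differ from the paper's parameterization.

$$
\mathcal{L}_{\lambda}(\mu,\nu,\mathbf{C}) = \min_{\mathbf{P} \in U(\mu,\nu)} \; \langle \mathbf{P}, \mathbf{C} \rangle - \lambda H(\mathbf{P}),
\qquad
H(\mathbf{P}) = -\sum_{i,j} \mathbf{P}_{ij} \left( \log \mathbf{P}_{ij} - 1 \right),
$$

$$
U(\mu,\nu) = \left\{ \mathbf{P} \in \mathbb{R}_{+}^{n \times m} \;:\; \mathbf{P}\mathbf{1}_m = \mu,\; \mathbf{P}^{\top}\mathbf{1}_n = \nu \right\}.
$$

Under this form with $\lambda > 0$, the objective is strictly convex over the compact convex set $U(\mu,\nu)$, which is the usual argument for why the minimizer is unique.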

Figures (4)

  • Figure 1: An example of transport plan.
  • Figure 2: Performance with different number of test inputs on CIFAR-100 vs. CIFAR-10 (in AUROC).
  • Figure 3: Ablation studies on CIFAR-100 vs. CIFAR-10 (in AUROC).
  • Figure 4: An illustration of framework with OT. The shared encoder receives training samples with data augmentation and test inputs, which produce two kinds of feature representations, forming the corresponding distributions (red area and blue area). After the mass transportation, we can obtain the optimal transport plan, where the conditional distribution entropy of each test input can be derived as score function. A test sample can be identified as ID or OOD by comparing its score with threshold.
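The pipeline described in the Figure 4 caption (features, transport plan, per-input conditional distribution entropy) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the uniform masses, the cosine-distance cost, the regularization value `reg`, and the function names `sinkhorn_plan` and `conditional_entropy_scores` are all hypothetical choices made for the example. Which side of the threshold counts as OOD is left to the paper.

```python
import numpy as np


def sinkhorn_plan(cost, mu, nu, reg=0.5, n_iters=500):
    """Approximate the entropic OT plan with plain Sinkhorn iterations.

    cost: (n, m) cost matrix; mu: (n,) and nu: (m,) marginals summing to 1.
    reg and n_iters are illustrative defaults, not values from the paper.
    """
    K = np.exp(-cost / reg)          # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(n_iters):
        v = nu / (K.T @ u)           # rescale to match column marginals
        u = mu / (K @ v)             # rescale to match row marginals
    return u[:, None] * K * v[None, :]


def conditional_entropy_scores(train_feats, test_feats, reg=0.5, n_iters=500):
    """Return one entropy value per test input, plus the transport plan.

    Assumptions (not confirmed by the source): features are L2-normalized,
    the cost is cosine distance, and both empirical distributions put
    uniform mass on their samples.
    """
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    test = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    cost = 1.0 - train @ test.T      # (n_train, n_test), values in [0, 2]

    n, m = cost.shape
    mu = np.full(n, 1.0 / n)
    nu = np.full(m, 1.0 / m)
    plan = sinkhorn_plan(cost, mu, nu, reg=reg, n_iters=n_iters)

    # Conditional distribution over training samples given each test input:
    # normalize every column of the plan so it sums to 1.
    cond = plan / plan.sum(axis=0, keepdims=True)
    # Shannon entropy of each column; the small constant guards log(0).
    entropy = -np.sum(cond * np.log(cond + 1e-12), axis=0)
    return entropy, plan
```

A possible usage: extract features for training samples and test inputs with the shared encoder, call `conditional_entropy_scores(train_feats, test_feats)`, and compare each returned entropy with a threshold chosen on held-out data.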

Theorems & Definitions (15)

  • Definition 2.1: OOD Detection
  • Definition 3.1
  • Theorem 3.2
  • Proposition 3.3
  • Definition 3.4: Conditional Distribution
  • Definition 3.5: Conditional Distribution Entropy
  • Proposition 3.6
  • proof
  • Definition A.1: Conditional Entropy
  • Remark A.2: Conditional Distribution Entropy and Joint Entropy in OT
  • ...and 5 more