
On the Learnability of Out-of-distribution Detection

Zhen Fang, Yixuan Li, Feng Liu, Bo Han, Jie Lu

TL;DR

This work initiates a PAC-learning-theoretic study of out-of-distribution (OOD) detection under two prevalent metrics, risk and AUC. It first proves that learnability in the full domain space is not guaranteed, identifying necessary conditions (C1, C1_auc) and establishing several impossibility results, especially when the in-distribution (ID) and OOD distributions overlap. The paper then isolates practical spaces (the separate space, the finite-ID-distribution space, and the density-based space) where OOD detection can be learnable under either metric, with precise necessary and sufficient conditions (e.g., Con2, Risk-based Realizability, AUC Realizability) and explicit rates. It further connects theory to practice by mapping these results to fully-connected neural networks and score-based methods, giving guidance on when to expect learnability and how to design architectures accordingly. Overall, the work clarifies the limitations of universal OOD detectors and offers theoretical support for several representative OOD methods, while outlining future directions for near- and far-OOD detection and for robustness.
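
To make the two metrics concrete, here is a minimal sketch of how risk and AUC might be computed empirically for a score-based detector. It is an illustration under stated assumptions, not the paper's formal definitions: the function names, the toy scores, and the threshold are hypothetical, the risk covers only the binary ID-versus-OOD decision (it ignores classification error among ID classes), and `alpha` stands for an assumed OOD proportion in the test mixture.

```python
from typing import Sequence


def detection_risk(id_scores: Sequence[float], ood_scores: Sequence[float],
                   threshold: float, alpha: float) -> float:
    """Mixture 0-1 risk of a thresholded detector (illustrative assumption).

    A sample is accepted as ID when its score is >= threshold.
    alpha is the assumed proportion of OOD data in the test mixture.
    """
    # Fraction of ID samples wrongly rejected as OOD.
    id_error = sum(s < threshold for s in id_scores) / len(id_scores)
    # Fraction of OOD samples wrongly accepted as ID.
    ood_error = sum(s >= threshold for s in ood_scores) / len(ood_scores)
    return (1 - alpha) * id_error + alpha * ood_error


def auc(id_scores: Sequence[float], ood_scores: Sequence[float]) -> float:
    """Empirical AUC: share of (ID, OOD) pairs where the ID sample scores higher.

    Ties count as half a correct pair. No threshold is involved.
    """
    correct = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                correct += 1.0
            elif s_id == s_ood:
                correct += 0.5
    return correct / (len(id_scores) * len(ood_scores))


# Hypothetical scores: higher means "more likely ID".
id_scores = [0.9, 0.8, 0.4]
ood_scores = [0.5, 0.2, 0.1]

risk = detection_risk(id_scores, ood_scores, threshold=0.45, alpha=0.5)
ranking_quality = auc(id_scores, ood_scores)
print(f"risk = {risk:.3f}, AUC = {ranking_quality:.3f}")  # risk = 0.333, AUC = 0.889
```

In this sketch the risk depends on both the threshold and the mixture weight `alpha`, whereas AUC depends only on how the scores rank ID samples against OOD samples.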

Abstract

Supervised learning aims to train a classifier under the assumption that training and test data are from the same distribution. To ease the above assumption, researchers have studied a more realistic setting: out-of-distribution (OOD) detection, where test data may come from classes that are unknown during training (i.e., OOD data). Due to the unavailability and diversity of OOD data, good generalization ability is crucial for effective OOD detection algorithms, and corresponding learning theory is still an open problem. To study the generalization of OOD detection, this paper investigates the probably approximately correct (PAC) learning theory of OOD detection that fits the commonly used evaluation metrics in the literature. First, we find a necessary condition for the learnability of OOD detection. Then, using this condition, we prove several impossibility theorems for the learnability of OOD detection under some scenarios. Although the impossibility theorems are frustrating, we find that some conditions of these impossibility theorems may not hold in some practical scenarios. Based on this observation, we next give several necessary and sufficient conditions to characterize the learnability of OOD detection in some practical scenarios. Lastly, we offer theoretical support for representative OOD detection works based on our OOD theory.

Paper Structure (72 sections, 68 theorems, 330 equations, 1 figure, 1 table)


Key Result

Theorem 1

Given spaces $\mathscr{D}_{XY}$ and $\mathscr{D}_{XY}'=\{D_{XY}^{\alpha}:\forall D_{XY}\in \mathscr{D}_{XY}, \forall \alpha\in [0,1)\}$, then 1) $\mathscr{D}_{XY}'$ is a priori-unknown space and $\mathscr{D}_{XY}\subset \mathscr{D}_{XY}'$; 2) if $\mathscr{D}_{XY}$ is a priori-unknown space, then Def

Figures (1)

  • Figure 1: Connections among the main theoretical results in this paper. Compared to risk, AUC imposes a stricter requirement on the classification: perfect classification (i.e., 100% accuracy) does not imply perfect AUC, which is why the theories for risk and AUC differ substantially.

Theorems & Definitions (77)

  • Definition 1: Learnability of OOD Detection under Risk
  • Definition 2: Strong Learnability of OOD Detection under Risk
  • Definition 3: Learnability of OOD Detection under AUC
  • Definition 4
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Definition 5: Overlap Between ID and OOD
  • Lemma 1
  • Theorem 4: Impossibility Theorem for Total Space under Risk
  • ...and 67 more