On the Learnability of Out-of-distribution Detection
Zhen Fang, Yixuan Li, Feng Liu, Bo Han, Jie Lu
TL;DR
This work initiates a PAC-learning-theoretic study of out-of-distribution (OOD) detection under two prevalent metrics, risk and AUC. It first shows that learnability over the full domain space is not guaranteed: it identifies necessary conditions (C1, C1_auc) and proves several impossibility results, especially when the in-distribution (ID) and OOD distributions overlap. The paper then isolates practical spaces (separate space, finite-ID-distribution space, and density-based space) in which OOD detection can be learnable under either metric, with precise necessary and sufficient conditions (e.g., Con2, Risk-based Realizability, AUC Realizability) and explicit rates. It further connects theory to practice by mapping these results to fully-connected neural networks and score-based methods, giving guidance on when to expect learnability and how to design architectures accordingly. Overall, the work clarifies the limitations of universal OOD detectors and offers theoretical support for several representative OOD methods, while outlining future directions for near- and far-OOD detection and for robustness.
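To make the two metrics concrete, the following is a minimal sketch rather than the paper's formal definitions. It assumes a score-based detector that assigns higher scores to ID inputs, and it simplifies risk to a weighted detection error that ignores ID classification. The function names, the weight `alpha`, the threshold, and the synthetic score ranges are all hypothetical choices made for illustration.

```python
import random


def empirical_auc(id_scores, ood_scores):
    """Fraction of (ID, OOD) pairs in which the ID sample scores higher.

    Ties count as one half. Higher score is assumed to mean "more ID-like".
    """
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


def detection_risk(id_scores, ood_scores, threshold, alpha=0.5):
    """Simplified risk: weighted sum of ID and OOD detection error rates.

    An input is declared ID when its score is >= threshold. The weight
    alpha on the OOD error is an illustrative assumption.
    """
    id_err = sum(s < threshold for s in id_scores) / len(id_scores)
    ood_err = sum(s >= threshold for s in ood_scores) / len(ood_scores)
    return (1.0 - alpha) * id_err + alpha * ood_err


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Separated case: ID and OOD score ranges do not intersect.
id_sep = [rng.uniform(0.6, 1.0) for _ in range(200)]
ood_sep = [rng.uniform(0.0, 0.4) for _ in range(200)]

# Overlapping case: the two score ranges share the interval [0.3, 0.7].
id_ovl = [rng.uniform(0.3, 1.0) for _ in range(200)]
ood_ovl = [rng.uniform(0.0, 0.7) for _ in range(200)]

auc_sep = empirical_auc(id_sep, ood_sep)
risk_sep = detection_risk(id_sep, ood_sep, threshold=0.5)
auc_ovl = empirical_auc(id_ovl, ood_ovl)
risk_ovl = detection_risk(id_ovl, ood_ovl, threshold=0.5)

print("separated:   AUC =", auc_sep, " risk =", risk_sep)
print("overlapping: AUC =", auc_ovl, " risk =", risk_ovl)
```

In this toy setup the separated scores admit a threshold with zero detection error, whereas the overlapping scores leave some error at the chosen threshold. This loosely mirrors the contrast the TL;DR draws between overlapping distributions and the separate space, though it says nothing about learnability itself.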
Abstract
Supervised learning aims to train a classifier under the assumption that training and test data are drawn from the same distribution. To relax this assumption, researchers have studied a more realistic setting: out-of-distribution (OOD) detection, where test data may come from classes that are unknown during training (i.e., OOD data). Because OOD data are unavailable during training and highly diverse, good generalization ability is crucial for effective OOD detection algorithms, yet the corresponding learning theory remains an open problem. To study the generalization of OOD detection, this paper investigates the probably approximately correct (PAC) learning theory of OOD detection in a form that fits the evaluation metrics commonly used in the literature. First, we find a necessary condition for the learnability of OOD detection. Then, using this condition, we prove several impossibility theorems for the learnability of OOD detection under some scenarios. Although the impossibility theorems are discouraging, we find that some of their conditions may not hold in practical scenarios. Based on this observation, we next give several necessary and sufficient conditions that characterize the learnability of OOD detection in these practical scenarios. Lastly, we offer theoretical support for representative OOD detection works based on our OOD theory.
