Generalized Out-of-Distribution Detection: A Survey

Jingkang Yang, Kaiyang Zhou, Yixuan Li, Ziwei Liu

TL;DR

This survey introduces a unified generalized OOD detection framework that encompasses anomaly, novelty, open-set, out-of-distribution, and outlier detection. It provides a comprehensive taxonomy and analyzes methodologies across classification-based, density-based, distance-based, and reconstruction-based approaches, including theoretical foundations and foundation-model considerations. The work emphasizes fair benchmarking (e.g., CIFAR/OpenOOD), practical insights like the effectiveness of data augmentation and post-hoc methods, and highlights open challenges such as evaluation standards, outlier-free learning, and real-world large-scale benchmarks. By linking OOD detection with related sub-tasks, the paper advocates cross-task knowledge transfer and outlines directions for robust, open-world AI systems.

Abstract

Out-of-distribution (OOD) detection is critical to ensuring the reliability and safety of machine learning systems. For instance, in autonomous driving, we would like the driving system to issue an alert and hand over the control to humans when it detects unusual scenes or objects that it has never seen during training time and cannot make a safe decision. The term, OOD detection, first emerged in 2017 and since then has received increasing attention from the research community, leading to a plethora of methods developed, ranging from classification-based to density-based to distance-based ones. Meanwhile, several other problems, including anomaly detection (AD), novelty detection (ND), open set recognition (OSR), and outlier detection (OD), are closely related to OOD detection in terms of motivation and methodology. Despite common goals, these topics develop in isolation, and their subtle differences in definition and problem setting often confuse readers and practitioners. In this survey, we first present a unified framework called generalized OOD detection, which encompasses the five aforementioned problems, i.e., AD, ND, OSR, OOD detection, and OD. Under our framework, these five problems can be seen as special cases or sub-tasks, and are easier to distinguish. We then review each of these five areas by summarizing their recent technical developments, with a special focus on OOD detection methodologies. We conclude this survey with open challenges and potential research directions.

Paper Structure

This paper contains 34 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Taxonomy of the generalized OOD detection framework, illustrated by classification tasks. Four criteria define the task taxonomy: 1) Distribution shift to detect: whether the task focuses on detecting covariate shift or semantic shift; 2) ID data type: whether the ID data contains a single class or multiple classes; 3) ID classification: whether the task requires classifying ID samples; 4) Learning scheme: transductive tasks require all observations, while inductive tasks follow the train-test scheme. Note that ND is often interchangeable with AD, but ND is more concerned with semantic anomalies. OOD detection is generally interchangeable with OSR for classification tasks.
  • Figure 2: Illustration of sub-tasks under the generalized OOD detection framework with vision tasks. Tags on test images refer to the model's expected predictions. (a) In sensory anomaly detection, test images with covariate shift are considered OOD. No semantic shift occurs in this setting. (b) In one-class novelty detection, normal/ID images belong to one class. Test images with semantic shift are considered OOD. (c) In multi-class novelty detection, ID images belong to multiple classes. Test images with semantic shift are considered OOD. Note that (b) and (c) together compose novelty detection, which is identical to the topic of semantic anomaly detection. (d) Open set recognition is identical to multi-class novelty detection in the detection task, with the only difference being that open set recognition further requires ID classification. Out-of-distribution detection solves the same problem as open set recognition. It canonically aims to detect test samples with semantic shift without losing ID classification accuracy. However, OOD detection encompasses a broader spectrum of learning tasks and solution space. (e) Outlier detection does not follow a train-test scheme. All observations are provided. It fits in the generalized OOD detection framework by defining the majority distribution as ID. Outliers can have any distribution shift from the majority.
  • Figure 3: Timeline for representative OOD detection methodologies. Different colors indicate different categories of methodologies. Each method has its corresponding reference (inconspicuous white) in the lower right corner. Methods with high citations and open-source code are prioritized for inclusion in this figure.
  • Figure 4: Illustration of the CIFAR-10 benchmark used in the benchmark section. The CIFAR-100 benchmark simply swaps the positions of CIFAR-10 and CIFAR-100 in the figure.
  • Figure 5: Comparison between different methodologies under the generalized OOD detection framework on the CIFAR-10/100 benchmarks. Results are from OpenOOD (yang2022openood). Different colors denote the method categories. Each method reports near-OOD (left bar) and far-OOD (right bar) AUROC scores, as introduced in the section on evaluation metrics. Method names in black were originally proposed for OOD detection; names in red are AD methods, names in blue are OSR methods, and names in pink are models from model uncertainty works.
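
The AUROC scores reported in Figure 5 can be read as the probability that a randomly chosen ID sample receives a higher score than a randomly chosen OOD sample. The sketch below illustrates that pairwise definition only; it is not the OpenOOD implementation. The function name `auroc`, the convention that a higher score means "more ID-like", and all score values are assumptions made up for illustration.

```python
def auroc(id_scores, ood_scores):
    """Pairwise AUROC: fraction of (ID, OOD) pairs where the ID sample
    scores higher than the OOD sample. Ties count as half a win.

    Assumption: higher score means the detector considers the sample
    more in-distribution.
    """
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Hypothetical detector scores, invented for this example.
id_scores = [0.9, 0.8, 0.7, 0.6]
near_ood_scores = [0.75, 0.5, 0.65, 0.4]  # overlaps with the ID scores
far_ood_scores = [0.3, 0.2, 0.1, 0.05]    # fully separated from the ID scores

near_auroc = auroc(id_scores, near_ood_scores)
far_auroc = auroc(id_scores, far_ood_scores)
print(f"near-OOD AUROC: {near_auroc:.4f}")
print(f"far-OOD AUROC:  {far_auroc:.4f}")
```

With these made-up scores, the near-OOD set overlaps the ID scores and yields a lower AUROC than the fully separated far-OOD set, which mirrors why the figure reports the two settings as separate bars. The quadratic loop is chosen for readability; a rank-based computation would be preferable for large test sets.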