SOOD-ImageNet: a Large-Scale Dataset for Semantic Out-Of-Distribution Image Classification and Semantic Segmentation
Alberto Bacchin, Davide Allegro, Stefano Ghidoni, Emanuele Menegatti
TL;DR
SOOD-ImageNet is a large-scale benchmark for semantic Out-Of-Distribution (SOOD) generalization in both image classification and semantic segmentation. Its data engine combines language hierarchies, Vision-Language Models, CLIP-based scoring, and targeted human verification to build an IID split and two OOD splits (Easy and Hard) from ImageNet-21K-P, yielding approximately 1.6M images across 56 super-classes. Experiments show that state-of-the-art deep learning models and large foundation models struggle under semantic shift, with only modest gains from data augmentation or pre-training, which underscores how hard SOOD generalization remains. The dataset and methodology enable scalable evaluation of SOOD generalization across tasks and motivate future work on richer partitions, broader coverage, and potential open-set recognition (OSR) applications.
Abstract
Out-of-Distribution (OOD) detection in computer vision is a crucial research area, with related benchmarks playing a vital role in assessing the generalizability of models and their applicability in real-world scenarios. However, existing OOD benchmarks in the literature suffer from two main limitations: (1) they often overlook semantic shift as a potential challenge, and (2) their scale is limited compared to the large datasets used to train modern models. To address these gaps, we introduce SOOD-ImageNet, a novel dataset comprising around 1.6M images across 56 classes, designed for common computer vision tasks such as image classification and semantic segmentation under OOD conditions, with a particular focus on the issue of semantic shift. We ensured the necessary scalability and quality by developing an innovative data engine that leverages the capabilities of modern vision-language models, complemented by accurate human checks. Through extensive training and evaluation of various models on SOOD-ImageNet, we showcase its potential to significantly advance OOD research in computer vision. The project page is available at https://github.com/bach05/SOODImageNet.git.
