A Unified Approach Towards Active Learning and Out-of-Distribution Detection
Sebastian Schmidt, Leonard Schenk, Leo Schwinn, Stephan Günnemann
TL;DR
The paper addresses two open-world challenges: selecting which unlabeled samples to label for training, and handling inputs the model has never seen. It introduces SISOM, a unified framework that handles both active learning (AL) and out-of-distribution (OOD) detection in a single module by combining enriched feature-space distances, gradient-weighted feature representations, and a distance-ratio-based sampling strategy. A latent-space analysis and the self-balancing variant SISOMe further integrate uncertainty and diversity. SISOM ranks first on two of the widely used OpenOOD benchmarks and second on the remaining one, and delivers strong AL performance across multiple datasets. The unified approach reduces deployment overhead, allows post-training latent-space refinement, and offers practical insight into the ambiguity between unlabeled and near-OOD data. Open-set AL and batch diversification are named as promising directions for future work.
Abstract
When applying deep learning models in open-world scenarios, active learning (AL) strategies are crucial for identifying label candidates from a nearly infinite amount of unlabeled data. In this context, robust out-of-distribution (OOD) detection mechanisms are essential for handling data outside the target distribution of the application. However, existing work investigates the two problems separately. In this work, we introduce SISOM as the first unified solution for both AL and OOD detection. By leveraging feature-space distance metrics, SISOM combines the strengths of the currently independent tasks to solve both effectively. We conduct extensive experiments that show the problems arising when methods are migrated between the two tasks. In these evaluations, SISOM underlines its effectiveness by achieving first place in two of the widely used OpenOOD benchmarks and second place in the remaining one. In AL, SISOM outperforms competing methods and delivers top-1 performance in three benchmarks.
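
To make the idea of a feature-space distance ratio more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the score of a sample is the distance to the nearest labeled feature of its predicted class divided by the distance to the nearest labeled feature of any other class. The function name `distance_ratio_scores`, the Euclidean metric, and the toy data are illustrative assumptions; the gradient weighting of features mentioned above is omitted.

```python
import numpy as np

def distance_ratio_scores(feats, preds, train_feats, train_labels):
    """Hypothetical ratio score d_in / d_out per sample (assumed form, not SISOM's exact definition)."""
    # Pairwise Euclidean distances, shape (n_samples, n_train).
    dists = np.linalg.norm(feats[:, None, :] - train_feats[None, :, :], axis=-1)
    # Mask marking labeled features that share the sample's predicted class.
    same = preds[:, None] == train_labels[None, :]
    # Nearest labeled feature of the predicted class.
    d_in = np.where(same, dists, np.inf).min(axis=1)
    # Nearest labeled feature of any other class.
    d_out = np.where(~same, dists, np.inf).min(axis=1)
    # Small constant avoids division by zero.
    return d_in / (d_out + 1e-12)

# Toy labeled set: class 0 near x = 0, class 1 near x = 4.
train_feats = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])
train_labels = np.array([0, 0, 1, 1])

# One sample close to class 0, one halfway between the classes.
feats = np.array([[0.5, 0.5], [2.0, 0.5]])
preds = np.array([0, 0])

scores = distance_ratio_scores(feats, preds, train_feats, train_labels)
print(scores)
```

Under this assumed definition, a small ratio indicates a sample that sits well inside its predicted class, while a ratio near 1 indicates a sample that is about equally close to another class. Such a score could plausibly serve both as an AL selection criterion and as an OOD score, which is the kind of shared signal a unified module relies on.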
