When and How Does In-Distribution Label Help Out-of-Distribution Detection?
Xuefeng Du, Yiyou Sun, Yixuan Li
TL;DR
This work asks when and how in-distribution (ID) labels improve out-of-distribution (OOD) detection. It builds a graph-based framework where ID data form a similarity graph and learns representations via spectral decomposition, which is shown to be equivalent to a contrastive objective. The authors derive a provable lower bound on the improvement in OOD detection accuracy when ID labels are used, expressed in terms of ID connectivity and ID–OOD coupling, and provide intuitive insights and a simplified bound for the near-OOD and far-OOD regimes. They validate the theory on both synthetic and real datasets (e.g., CIFAR-10/100), demonstrating that ID labels yield notable gains in near-OOD scenarios and under certain connectivity conditions, with results robust to changes in the OOD distribution between training and evaluation. The work advances theoretical understanding of the ID–OOD relationship and offers practical guidance for leveraging ID labels in OOD-sensitive applications.
Abstract
Detecting data points that deviate from the training distribution is pivotal for ensuring reliable machine learning. Extensive research has been dedicated to this challenge, ranging from classical anomaly detection techniques to contemporary out-of-distribution (OOD) detection approaches. While OOD detection commonly relies on supervised learning from a labeled in-distribution (ID) dataset, anomaly detection may treat the entire ID data as a single class and disregard ID labels. This fundamental distinction raises a significant question that has yet to be rigorously explored: when and how do ID labels help OOD detection? This paper bridges this gap by offering a formal framework that theoretically delineates the impact of ID labels on OOD detection. We employ a graph-theoretic approach, rigorously analyzing the separability of ID data from OOD data in closed form. Key to our approach is the characterization of data representations through spectral decomposition on the graph. Leveraging these representations, we establish a provable error bound that compares OOD detection performance with and without ID labels, unveiling the conditions under which ID labels enhance OOD detection. Lastly, we present empirical results on both simulated and real datasets, validating the theoretical guarantees and reinforcing our insights. Code is publicly available at https://github.com/deeplearning-wisc/id_label.
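
To make the phrase "spectral decomposition on the graph" concrete, here is a minimal sketch in Python. It is not the authors' implementation (see the repository above for that). It assumes a toy setup invented for illustration: two ID classes and one OOD cluster in 2D, a Gaussian kernel standing in for the similarity graph, and a hypothetical `label_weight` parameter that adds extra edge weight between ID points sharing a class label. The features are the top-k eigenvectors of the normalized adjacency matrix, scaled by the square roots of their eigenvalues, which is one common way such spectral representations are defined.

```python
import numpy as np


def normalized_adjacency(A):
    """Return D^{-1/2} A D^{-1/2}, where D is the diagonal degree matrix of A."""
    inv_sqrt_deg = 1.0 / np.sqrt(A.sum(axis=1))
    return A * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]


def spectral_features(A, k):
    """Top-k eigenvectors of the normalized adjacency, scaled by sqrt(eigenvalue)."""
    eigvals, eigvecs = np.linalg.eigh(normalized_adjacency(A))  # ascending order
    top = np.argsort(eigvals)[::-1][:k]
    lam = np.clip(eigvals[top], 0.0, None)  # guard against tiny negative values
    return eigvecs[:, top] * np.sqrt(lam)


def toy_graph(rng, n_per_group=20, label_weight=0.0):
    """Toy similarity graph (an assumption for illustration, not the paper's data).

    Groups 0 and 1 play the role of two ID classes; group 2 plays the role of OOD.
    label_weight > 0 adds edge weight between ID points with the same label.
    """
    centers = np.array([[0.0, 0.0], [3.0, 0.0], [1.5, 2.5]])
    X = np.concatenate(
        [c + 0.5 * rng.standard_normal((n_per_group, 2)) for c in centers]
    )
    group = np.repeat(np.arange(3), n_per_group)

    # Gaussian-kernel similarity as a stand-in for the unlabeled graph.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    A = np.exp(-sq_dist / 2.0)

    if label_weight > 0:
        is_id = group < 2
        same_label = (group[:, None] == group[None, :]) & is_id[:, None] & is_id[None, :]
        A = A + label_weight * same_label
    return A, group


if __name__ == "__main__":
    # Same random seed in both calls, so only the label edges differ.
    A_unlabeled, group = toy_graph(np.random.default_rng(0), label_weight=0.0)
    A_labeled, _ = toy_graph(np.random.default_rng(0), label_weight=1.0)
    F_unlabeled = spectral_features(A_unlabeled, k=3)
    F_labeled = spectral_features(A_labeled, k=3)
    print(F_unlabeled.shape, F_labeled.shape)
```

Setting `label_weight=0.0` gives the graph without label information, and a positive value gives the label-augmented graph. Fitting a simple ID-vs-OOD classifier on each set of features is one way to probe, on toy data, the kind of comparison the error bound formalizes.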
