Long-Tailed Out-of-Distribution Detection: Prioritizing Attention to Tail
Yina He, Lei Peng, Yongcun Zhang, Juanjuan Weng, Zhiming Luo, Shaozi Li
TL;DR
This work tackles long-tailed OOD detection by decoupling ID data balance from OOD separation and introducing PATT, a framework that combines temperature-scaling-based implicit semantic augmentation with post-hoc feature calibration. The training component, ISAC, models ID features as a mixture of von Mises-Fisher distributions on a hypersphere, which yields a closed-form contrastive loss corresponding to infinitely many sampled pairs, while the classifier is sharpened via a temperature-scaled logit adjustment. During inference, a tail-focused attention mechanism recalibrates features to balance head and tail representations and to suppress confidence on OOD data, and an energy-based score is used to detect OOD inputs. Across CIFAR10/100-LT and ImageNet-LT, PATT yields substantial gains in AUROC and in classification accuracy, including on tail classes, outperforming prior long-tailed OOD methods and remaining robust to hyperparameters and model architectures. The approach offers a practical, end-to-end solution with improved long-tailed recognition and reliable OOD detection in real-world settings.
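The energy-based OOD score mentioned above can be illustrated with a minimal sketch. It uses the commonly used form of the score, the temperature-scaled log-sum-exp of the classifier logits, and is not taken from the PATT implementation; the function names, the threshold, and the example logits are hypothetical, and the tail-focused feature calibration is not modeled here.

```python
import numpy as np

def energy_score(logits, temperature=1.0):
    """Negative free energy T * logsumexp(logits / T); higher means more ID-like.

    Assumption: this is the generic energy score, not necessarily the exact
    variant used by PATT.
    """
    z = np.asarray(logits, dtype=np.float64) / temperature
    m = z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return temperature * lse

def is_ood(logits, threshold, temperature=1.0):
    """Flag samples whose score falls below a (hypothetical) validation threshold."""
    return energy_score(logits, temperature) < threshold

# Illustrative logits: one confident prediction, one nearly flat prediction.
confident = np.array([[9.0, 0.5, 0.2]])
flat = np.array([[0.4, 0.5, 0.3]])
```

In this sketch a confident ID-like prediction receives a higher score than a flat one, so thresholding the score separates the two. In practice the threshold would be chosen on held-out data, for example to fix the true-positive rate on ID samples.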
Abstract
Current out-of-distribution (OOD) detection methods typically assume balanced in-distribution (ID) data, while most real-world data follow a long-tailed distribution. Previous approaches to long-tailed OOD detection often balance the ID data by reducing the semantics of head classes. However, this reduction can severely affect the classification accuracy of ID data. The main challenge of this task lies in the severe lack of features for tail classes, which leads to confusion with OOD data. To tackle this issue, we introduce a novel Prioritizing Attention to Tail (PATT) method that uses augmentation instead of reduction. Our main idea is to use a mixture of von Mises-Fisher (vMF) distributions to model the ID data and a temperature scaling module to boost the confidence of ID data. This enables us to generate infinite contrastive pairs, implicitly enhancing the semantics of ID classes while promoting differentiation between ID and OOD data. To further strengthen the detection of OOD data without compromising the classification performance of ID data, we propose feature calibration during the inference phase. By extracting an attention weight from the training set that prioritizes the tail classes and reduces the confidence in OOD data, we improve the OOD detection capability. Extensive experiments verify that our method outperforms the current state-of-the-art methods on various benchmarks.
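To make the role of temperature scaling and class priors on a long-tailed training set concrete, the following is a minimal sketch of a generic logit-adjusted cross-entropy. The form used here (logits divided by a temperature, plus the log of the empirical class prior) is an assumption for illustration and is not claimed to be the exact PATT loss; `logit_adjusted_ce`, `class_counts`, and the example numbers are hypothetical.

```python
import numpy as np

def log_softmax(z):
    """Numerically stable log-softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def logit_adjusted_ce(logits, labels, class_counts, temperature=1.0):
    """Cross-entropy on temperature-scaled logits shifted by the log class prior.

    Assumption: generic logit adjustment for long-tailed data, used here only
    to illustrate the idea; the paper's formulation may differ.
    """
    prior = np.asarray(class_counts, dtype=np.float64)
    prior = prior / prior.sum()
    adjusted = np.asarray(logits, dtype=np.float64) / temperature + np.log(prior)
    logp = log_softmax(adjusted)
    return -logp[np.arange(len(labels)), labels].mean()

# Hypothetical long-tailed class counts: one head class, one tail class.
counts = [900, 90, 10]
uninformative = np.zeros((1, 3))  # a classifier with no preference yet
head_loss = logit_adjusted_ce(uninformative, [0], counts)
tail_loss = logit_adjusted_ce(uninformative, [2], counts)
```

With uninformative logits, the tail-class sample incurs a much larger loss than the head-class sample, so training is pushed to produce larger raw logits for tail classes. This is the sense in which such an adjustment directs attention to the tail.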
