Negative Label Guided OOD Detection with Pretrained Vision-Language Models
Xue Jiang, Feng Liu, Zhen Fang, Hong Chen, Tongliang Liu, Feng Zheng, Bo Han
TL;DR
This work tackles zero-shot OOD detection in vision-language models by enlarging the label space with a large set of negative labels sourced from broad corpora. It introduces NegLabel, a post hoc detector that uses NegMining to select discriminative negative labels and a sum-softmax OOD score that fuses affinities to ID labels and negative labels, with a grouping strategy to reduce variance. The authors provide a theoretical analysis linking negative labels to improved separability between ID and OOD via a multi-label framework and normal approximation, and demonstrate state-of-the-art performance across CLIP-like and other VLM architectures while showing robustness to domain shifts. The method is simple to deploy, does not require fine-tuning, and has practical implications for safer deployment of VLMs in open-world settings.
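The sum-softmax score described above can be illustrated with a minimal sketch. It assumes the score is the share of softmax mass that falls on the ID labels when the image embedding is compared against both ID and negative label embeddings, and that grouping means averaging this score over disjoint groups of negative labels. The function name, the temperature value, and the use of NumPy arrays in place of real VLM embeddings are assumptions for illustration, not the authors' released implementation.

```python
import numpy as np

def sum_softmax_score(image_emb, id_text_emb, neg_text_emb,
                      temperature=0.01, num_groups=1):
    """Hypothetical sketch of a sum-softmax OOD score (higher = more ID-like).

    image_emb:    (D,)   image embedding
    id_text_emb:  (K, D) text embeddings of the ID labels
    neg_text_emb: (M, D) text embeddings of the negative labels
    temperature and num_groups are assumed hyperparameters.
    """
    def normalize(x):
        x = np.asarray(x, dtype=np.float64)
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    img = normalize(image_emb)
    id_logits = normalize(id_text_emb) @ img / temperature    # (K,)
    neg_logits = normalize(neg_text_emb) @ img / temperature  # (M,)

    # Subtract a shared maximum before exponentiating for numerical stability.
    shift = max(id_logits.max(), neg_logits.max())
    id_mass = np.exp(id_logits - shift).sum()
    neg_exp = np.exp(neg_logits - shift)

    # One score per group of negative labels, then the average over groups.
    group_scores = [id_mass / (id_mass + group.sum())
                    for group in np.array_split(neg_exp, num_groups)]
    return float(np.mean(group_scores))
```

An input would then be flagged as OOD when its score falls below a chosen threshold. With `num_groups=1` the function reduces to a single softmax over the combined label set.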
Abstract
Out-of-distribution (OOD) detection aims to identify samples from unknown classes and plays a crucial role in making models trustworthy against errors on unexpected inputs. Extensive research has explored OOD detection in the vision modality. Vision-language models (VLMs) can leverage both textual and visual information for various multi-modal applications, yet few OOD detection methods take information from the text modality into account. In this paper, we propose a novel post hoc OOD detection method, called NegLabel, which takes a vast number of negative labels from extensive corpus databases. We design a novel scheme for the OOD score that incorporates these negative labels. Theoretical analysis helps explain the mechanism of negative labels. Extensive experiments demonstrate that NegLabel achieves state-of-the-art performance on various OOD detection benchmarks and generalizes well across multiple VLM architectures. Furthermore, NegLabel exhibits remarkable robustness against diverse domain shifts. The code is available at https://github.com/tmlr-group/NegLabel.
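The TL;DR mentions that NegMining selects discriminative negative labels from a corpus. The text here does not state the selection criterion, so the sketch below assumes one plausible rule: keep the candidate words whose text embeddings are least similar to the ID label embeddings. The function name, the percentile summary, and its default value are all hypothetical.

```python
import numpy as np

def select_negative_labels(candidate_emb, id_emb, top_k, q=95.0):
    """Hypothetical sketch: return indices of the top_k candidates that are
    least similar to the ID labels.

    candidate_emb: (C, D) text embeddings of candidate words from a corpus
    id_emb:        (K, D) text embeddings of the ID labels
    q:             assumed percentile used to summarize each candidate's
                   similarity to the ID label set
    """
    cand = np.asarray(candidate_emb, dtype=np.float64)
    ids = np.asarray(id_emb, dtype=np.float64)
    cand = cand / np.linalg.norm(cand, axis=1, keepdims=True)
    ids = ids / np.linalg.norm(ids, axis=1, keepdims=True)

    sims = cand @ ids.T                          # (C, K) cosine similarities
    closeness = np.percentile(sims, q, axis=1)   # near-max similarity per candidate

    # Candidates with the lowest closeness are farthest from the ID label set.
    order = np.argsort(closeness, kind="stable")
    return order[:top_k]
```

Using a high percentile instead of the plain maximum is a design choice made here so that a single unusually similar ID label does not dominate the ranking.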
