CLIPScope: Enhancing Zero-Shot OOD Detection with Bayesian Scoring

Hao Fu, Naman Patel, Prashanth Krishnamurthy, Farshad Khorrami

TL;DR

CLIPScope is a zero-shot OOD detection approach that normalizes a sample's confidence score by class likelihoods, akin to a Bayesian posterior update, and uses a novel strategy to mine OOD classes from a large lexical database.

Abstract

Detection of out-of-distribution (OOD) samples is crucial for safe real-world deployment of machine learning models. Recent advances in vision-language foundation models have made them capable of detecting OOD samples without requiring in-distribution (ID) images. However, these zero-shot methods often underperform because they do not adequately consider ID class likelihoods in their detection confidence scoring. Hence, we introduce CLIPScope, a zero-shot OOD detection approach that normalizes the confidence score of a sample by class likelihoods, akin to a Bayesian posterior update. Furthermore, CLIPScope incorporates a novel strategy to mine OOD classes from a large lexical database. It selects the class labels that are farthest from and nearest to the ID classes in terms of CLIP embedding distance to maximize coverage of OOD samples. We conduct extensive ablation studies and empirical evaluations, demonstrating state-of-the-art performance of CLIPScope across various OOD detection benchmarks.
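
The label-mining idea in the abstract (pick the candidate labels nearest to and farthest from the ID classes in embedding space) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: random vectors stand in for CLIP text embeddings, `mine_ood_labels` and its parameters are hypothetical names, and measuring a candidate's distance to the ID set as its cosine distance to the closest ID class is an assumption.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]

def cosine_distance(u, v):
    """Cosine distance between two unit-length vectors."""
    return 1.0 - sum(a * b for a, b in zip(u, v))

def mine_ood_labels(id_emb, cand_emb, n_near, n_far):
    """Return indices of the candidates nearest to and farthest from the ID classes."""
    id_unit = [normalize(v) for v in id_emb]
    cand_unit = [normalize(v) for v in cand_emb]
    # Assumption: distance to the ID set = distance to the closest ID class.
    dist = [min(cosine_distance(c, i) for i in id_unit) for c in cand_unit]
    order = sorted(range(len(dist)), key=dist.__getitem__)  # ascending distance
    nearest = order[:n_near]
    farthest = order[len(order) - n_far:]
    return nearest, farthest, dist

# Stand-in data: 5 ID class embeddings and 50 candidate label embeddings.
rng = random.Random(0)
id_emb = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(5)]
cand_emb = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(50)]

nearest, farthest, dist = mine_ood_labels(id_emb, cand_emb, n_near=3, n_far=3)
print("nearest candidates:", nearest)
print("farthest candidates:", farthest)
```

In a real pipeline the candidate list would come from the lexical database, with any label that coincides with an ID class removed first; the two index lists overlap if `n_near + n_far` exceeds the number of candidates.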

Paper Structure

This paper contains 24 sections, 1 theorem, 10 equations, 9 figures, 8 tables, 1 algorithm.

Key Result

Lemma 3.1

By Bayes' rule, the class likelihood of OOD samples is proportional to the global class likelihood, i.e., $\mathbb{P}\left (f(x,\mathcal{Y})=y_i ~|~ x\in\text{OOD} \right ) \propto \mathbb{P}(f(x,\mathcal{Y})=y_i ).$
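
The Bayes step behind this statement can be written out as below. This is a sketch of one reading, not the paper's proof, which is not reproduced in this summary. Expanding the left-hand side gives

$$\mathbb{P}\left(f(x,\mathcal{Y})=y_i ~|~ x\in\text{OOD}\right) = \frac{\mathbb{P}\left(x\in\text{OOD} ~|~ f(x,\mathcal{Y})=y_i\right)\,\mathbb{P}\left(f(x,\mathcal{Y})=y_i\right)}{\mathbb{P}\left(x\in\text{OOD}\right)}.$$

The denominator does not depend on $i$. If, in addition, $\mathbb{P}\left(x\in\text{OOD} ~|~ f(x,\mathcal{Y})=y_i\right)$ is taken to be the same for every class $y_i$ (an assumption made here for illustration and not stated in this summary), the right-hand side is a constant multiple of $\mathbb{P}\left(f(x,\mathcal{Y})=y_i\right)$, which is the stated proportionality.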

Figures (9)

  • Figure 1: The process of CLIPScope involves three key stages: OOD label mining, prior estimate, and posterior update.
  • Figure 2: The nearest (red) and farthest (dark) labels are picked.
  • Figure 3: AUROC (%) on domain-shifted ID datasets. A higher AUROC implies a better performance.
  • Figure 4: FPR95 (%) on domain-shifted ID datasets. A lower FPR95 implies a better performance.
  • Figure 5: Performance (in %) of CLIPScope when applied to small ID datasets. The OOD datasets include iNaturalist, SUN, Places, and Textures. The reported numbers represent average results across these four OOD datasets.
  • ...and 4 more figures
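
The prior-estimate and posterior-update stages named in Figure 1 might be sketched as follows. The exact CLIPScope score is not given in this summary, so everything here is an assumption made for illustration: the prior is a smoothed histogram of predicted ID classes, the raw confidence is the softmax mass on the ID labels, and the score divides that mass by the prior of the predicted class. The function names, the `smoothing` parameter, and the logits are all hypothetical.

```python
import math
from collections import Counter

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def estimate_class_prior(pred_classes, n_id, smoothing=1.0):
    """Prior estimate: smoothed frequency with which each ID class is predicted."""
    counts = Counter(pred_classes)
    total = len(pred_classes) + smoothing * n_id
    return [(counts[i] + smoothing) / total for i in range(n_id)]

def posterior_score(logits, n_id, prior):
    """Hypothetical score: ID softmax mass divided by the predicted class's prior.

    The first n_id logits belong to ID labels, the rest to mined OOD labels.
    """
    probs = softmax(logits)
    id_probs = probs[:n_id]
    pred = max(range(n_id), key=id_probs.__getitem__)
    return sum(id_probs) / prior[pred], pred

n_id = 3
# Predicted ID classes for previously seen samples (made-up history).
prior = estimate_class_prior([0, 0, 0, 0, 1, 2], n_id)

# Two samples with the same ID mass but different predicted classes.
score_a, pred_a = posterior_score([5.0, 0.0, 0.0, 1.0, 1.0], n_id, prior)
score_b, pred_b = posterior_score([0.0, 5.0, 0.0, 1.0, 1.0], n_id, prior)
print(pred_a, score_a)
print(pred_b, score_b)
```

Under these assumptions, a sample assigned to a frequently predicted class receives a lower score than an equally confident sample assigned to a rarely predicted class, which is the sense in which the score is normalized by class likelihood.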

Theorems & Definitions (2)

  • Lemma 3.1
  • proof