Your Classifier Can Be Secretly a Likelihood-Based OOD Detector

Jirayu Burapacheep; Yixuan Li

TL;DR

The paper tackles reliable OOD detection for discriminative classifiers by introducing INK, a framework that assigns an intrinsic likelihood to hyperspherical embeddings. By modeling latent representations as a mixture of von Mises–Fisher (vMF) distributions on the unit sphere, INK defines the intrinsic score $S(\mathbf{z}) = \tau \cdot \log \sum_{j=1}^{C} \exp(\boldsymbol{\mu}_j^\top \mathbf{z} / \tau)$, where $\boldsymbol{\mu}_j$ is the prototype of class $j$ and $\tau$ is a temperature. Under uniform class priors, this score is a logarithmic function of the density $p(\mathbf{z})$ up to a constant, so thresholding either quantity yields the same level sets. The authors show how standard maximum-likelihood training with a vMF mixture shapes the intrinsic likelihood to favor in-distribution data, provide theoretical links to likelihood-based OOD detection, and validate the approach on the OpenOOD benchmarks, where INK achieves state-of-the-art or competitive performance, often at substantially lower computational cost than KNN-based methods. They also extend the method to handle class imbalance and show robustness across architectures, including ResNet and ViT, in both near-OOD and far-OOD settings. Overall, INK offers a principled, efficient, and adaptable likelihood-based OOD detector for modern discriminative models.
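As a rough illustration of the score formula above, the following NumPy sketch evaluates $S(\mathbf{z})$ for a batch of embeddings. It is not the authors' implementation: the function name `ink_score`, the default `tau=0.05`, and the toy prototypes are assumptions chosen for the example, and the log-sum-exp is computed in the usual numerically stable way.

```python
import numpy as np

def ink_score(z, prototypes, tau=0.05):
    """Return S(z) = tau * log(sum_j exp(mu_j^T z / tau)) for each row of z.

    `ink_score` and the default `tau` are hypothetical names/values for this sketch.
    """
    z = np.atleast_2d(np.asarray(z, dtype=float))
    mu = np.asarray(prototypes, dtype=float)
    # Project embeddings and prototypes onto the unit hypersphere.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    mu = mu / np.linalg.norm(mu, axis=1, keepdims=True)
    logits = z @ mu.T / tau  # shape (N, C): cosine similarities scaled by 1/tau
    # Numerically stable log-sum-exp over the class axis.
    m = logits.max(axis=1, keepdims=True)
    lse = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))
    return tau * lse

# Toy example: two orthogonal prototypes in 2-D (illustrative values only).
prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])
near = ink_score([1.0, 0.0], prototypes)    # aligned with the first prototype
far = ink_score([-1.0, -1.0], prototypes)   # points away from both prototypes
```

In this toy setup the embedding aligned with a prototype receives a higher score than the one pointing away from both, which is the ordering a likelihood-based detector would threshold on.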

Abstract

The ability to detect out-of-distribution (OOD) inputs is critical to guarantee the reliability of classification models deployed in an open environment. A fundamental challenge in OOD detection is that a discriminative classifier is typically trained to estimate the posterior probability p(y|z) for class y given an input z, but lacks the explicit likelihood estimate of p(z) ideally needed for OOD detection. While numerous OOD scoring functions have been proposed for classification models, the resulting scores are often heuristic-driven and cannot be rigorously interpreted as likelihoods. To bridge the gap, we propose Intrinsic Likelihood (INK), which offers a rigorous likelihood interpretation for modern discriminative classifiers. Specifically, our proposed INK score operates on the constrained latent embeddings of a discriminative classifier, which are modeled as a mixture of hyperspherical embeddings with constant norm. We draw a novel connection between the hyperspherical distribution and the intrinsic likelihood, which can be effectively optimized in modern neural networks. Extensive experiments on the OpenOOD benchmark empirically demonstrate that INK establishes a new state-of-the-art in a variety of OOD detection setups, including both far-OOD and near-OOD. Code is available at https://github.com/deeplearning-wisc/ink.

Paper Structure (36 sections, 2 theorems, 18 equations, 5 figures, 9 tables)

Introduction
Preliminaries
Methodology
Intrinsic Likelihood
Benefits of the probabilistic model.
Intrinsic likelihood for OOD detection.
Optimizing Intrinsic Likelihood in Deep Neural Networks
How does the loss function shape intrinsic likelihood?
Differences w.r.t. Existing Approaches
Experiments
Setup
Evaluation metrics and implementation details.
Main Results
Intrinsic likelihood score establishes state-of-the-art performance.
Near-OOD detection.
...and 21 more sections

Key Result

Theorem 3.1

Under uniform class prior, the intrinsic likelihood score $S(\mathbf{z})$ is a logarithmic function of the density $p(\mathbf{z})$, with a constant difference. The two measurements return the same level set for OOD detection.
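The reasoning behind this statement can be sketched in a few lines. The derivation below is a reconstruction under stated assumptions rather than the paper's own proof: it assumes each class-conditional density is a vMF distribution with a shared concentration $\kappa = 1/\tau$, writes $Z_d(\kappa)$ for the vMF normalizing constant in $d$ dimensions, and uses a uniform prior $1/C$ over the $C$ classes.

```latex
\begin{align*}
p(\mathbf{z})
  &= \sum_{j=1}^{C} \frac{1}{C}\, Z_d(1/\tau)\,
     \exp\!\left(\boldsymbol{\mu}_j^\top \mathbf{z} / \tau\right), \\
\log p(\mathbf{z})
  &= \log \sum_{j=1}^{C} \exp\!\left(\boldsymbol{\mu}_j^\top \mathbf{z} / \tau\right)
     + \log Z_d(1/\tau) - \log C
   = \frac{S(\mathbf{z})}{\tau} + \log Z_d(1/\tau) - \log C, \\
S(\mathbf{z})
  &= \tau \log p(\mathbf{z}) + \tau \bigl(\log C - \log Z_d(1/\tau)\bigr).
\end{align*}
```

Because $\tau > 0$ and the last term does not depend on $\mathbf{z}$, $S(\mathbf{z})$ is a strictly increasing function of $p(\mathbf{z})$, so thresholding either quantity separates the same sets of inputs.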

Figures (5)

Figure 1: Overview of intrinsic likelihood framework for OOD detection. The neural network is trained on the in-distribution (ID) data, which lies on the unit hypersphere in the latent space. The intrinsic likelihood score is theoretically equivalent to the log-likelihood $\log p(\mathbf{z})$, which suits OOD detection.
Figure 2: Comparison of Energy vs. INK on ImageNet (ID) with ResNet-50, using the OpenOOD benchmark (Zhang et al., 2023). INK consistently outperforms Energy on far-OOD and near-OOD datasets.
Figure 3: Density plots illustrating the distribution of energy score (left) and INK (right). Lighter and darker blue shades represent the distributions of scores for the ID and OOD datasets, respectively.
Figure 4: Ablation on test-time temperatures. The results are averaged across the far-OOD test sets and 3 random training runs, based on ResNet-34 trained on CIFAR-100.
Figure 5: UMAP visualization of a subset of ImageNet classes and OOD datasets. The class prototypes are designated by a star symbol $\texttt{*}$, while the OOD embeddings are distinguished by pink color.

Theorems & Definitions (6)

Definition 2.1: OOD Detection
Definition 2.2: Likelihood-based OOD Detector
Definition 3.1: von Mises-Fisher Distribution (Fisher, 1953)
Definition 3.2: Intrinsic Likelihood Score
Theorem 3.1
Theorem A.1: Misalignment between energy score and likelihood
