Prior Distribution and Model Confidence

Maksim Kazanskii, Artem Kasianov

TL;DR

The paper addresses unreliable predictions under distribution shift in image classification by introducing Embedding Density, a model-agnostic, post-hoc confidence framework that assesses reliability by measuring distances in embedding space between test samples and the training distribution. It presents a four-step pipeline (building a base embedding database, nearest-neighbor retrieval, confidence gating via thresholds L and N, and selection of N* and L* using Normalized Confidence Gain) and formally defines the Confidence Curve, Confidence Gain, and Normalized Confidence Gain to quantify improvements. Empirical results across CNNs and ViTs show that filtering low-density predictions improves effective accuracy. Embedding Density often matches logit-based OOD detectors on benchmarks and remains usable when logits are unavailable, since it leverages the geometry of the training manifold. The work highlights that embedding quality (capacity, resolution, pretraining data) drives reliability gains, while naive ensemble approaches offer limited benefit, and argues that the approach may extend to other modalities such as NLP. Overall, the approach provides a lightweight, scalable mechanism for drift detection and instance-level reliability in evolving data streams and foundation-model deployments.

Abstract

We study how the training data distribution affects confidence and performance in image classification models. We introduce Embedding Density, a model-agnostic framework that estimates prediction confidence by measuring the distance of test samples from the training distribution in embedding space, without requiring retraining. By filtering low-density (low-confidence) predictions, our method significantly improves classification accuracy. We evaluate Embedding Density across multiple architectures and compare it with state-of-the-art out-of-distribution (OOD) detection methods. The proposed approach is potentially generalizable beyond computer vision.
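
The gating idea described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes that N is the number of nearest neighbors retrieved from the base embedding database, that L is a distance threshold, and that a prediction is kept when the mean Euclidean distance to its N nearest training embeddings is at most L. The function names (`knn_mean_distance`, `gate_predictions`, `accuracy_and_coverage`) and the toy 2-D embeddings are hypothetical, and the paper's Confidence Gain and Normalized Confidence Gain are not reproduced here because their formulas are not given in this summary.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def knn_mean_distance(query: Vector, base: List[Vector], n_neighbors: int) -> float:
    """Mean Euclidean distance from `query` to its n nearest base embeddings."""
    if not 1 <= n_neighbors <= len(base):
        raise ValueError("n_neighbors must be between 1 and len(base)")
    distances = sorted(math.dist(query, b) for b in base)
    return sum(distances[:n_neighbors]) / n_neighbors


def gate_predictions(
    queries: List[Vector], base: List[Vector], n_neighbors: int, max_distance: float
) -> List[bool]:
    """Assumed gating rule: keep a prediction if its mean k-NN distance <= max_distance."""
    return [
        knn_mean_distance(q, base, n_neighbors) <= max_distance for q in queries
    ]


def accuracy_and_coverage(
    correct: List[bool], accepted: List[bool]
) -> Tuple[float, float]:
    """Accuracy on the accepted predictions and the fraction of samples accepted."""
    kept = [c for c, a in zip(correct, accepted) if a]
    coverage = len(kept) / len(correct)
    accuracy = sum(kept) / len(kept) if kept else float("nan")
    return accuracy, coverage


# Toy base embedding database (hypothetical training embeddings).
base = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
# One query inside the training region, one far away from it.
queries = [(0.5, 0.5), (5.0, 5.0)]
# Whether the classifier's prediction for each query was correct (made up).
correct = [True, False]

accepted = gate_predictions(queries, base, n_neighbors=2, max_distance=1.0)
accuracy, coverage = accuracy_and_coverage(correct, accepted)
print(accepted, accuracy, coverage)
```

Sweeping `max_distance` (and `n_neighbors`) and recording the resulting accuracy and coverage pairs would trace out an accuracy-versus-coverage curve of the kind the figure captions describe. For a realistic embedding database, an approximate nearest-neighbor index would replace the exhaustive sort used here.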

Paper Structure

This paper contains 26 sections, 18 equations, 3 figures, 8 tables, 1 algorithm.

Figures (3)

  • Figure 1: Confidence Curves and Confidence Gain for ResNet-101 and DINO-V2 ViT-B/14.
  • Figure 2: Normalized Confidence Gain vs. number of neighbors $N$ for different classification models.
  • Figure 3: Confidence curves (accuracy vs. total coverage) for the ResNet50 model evaluated on the internal (left) and external (right) datasets using an ensemble of embedding models.