TASTE: Task-Aware Out-of-Distribution Detection via Stein Operators

Michał Kozyra, Gesine Reinert

TL;DR

TASTE introduces a task-aware OOD diagnostic by fixing a pretrained predictor $f_\theta$ and coupling it with a score-based model $p$ for the training distribution through the Langevin Stein operator $\mathcal{L}_p f(x)=\Delta f(x)+s_p(x)^{\top}\nabla f(x)$. The core quantity $S_f(p,q)=\mathbb{E}_q[\mathcal{L}_p f(X)]$ admits a geometric projection $S_f(p,q)=-\mathbb{E}_q[\nabla f(X)^{\top}\nabla\log\frac{q(X)}{p(X)}]$, making the signal task-aware by measuring shift components that align with the model’s input sensitivity. The framework provides robust bias-correction against approximate score models via $\tilde{S}_f(p,q)$, and enables per-sample and per-dimension residuals $r_f(x)$ and $r_{f,i}(x)$ for interpretable diagnostics, including per-pixel anomaly maps in images. Empirical results across controlled 2D shifts, MNIST rotations, CIFAR-10 benchmarks, and MVTec AD demonstrate that TASTE tracks task degradation and offers competitive OOD detection with meaningful localization, all without retraining or requiring negative samples. This task-aware operator approach thus bridges generative data geometry with discriminative model sensitivity, supporting reliable post-deployment monitoring and auditing of deep systems.
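The behaviour described above can be checked numerically in a toy setting. The following is a minimal sketch, not the authors' implementation: it assumes a standard Gaussian training density $p=\mathcal{N}(0,I)$ with known score $s_p(x)=-x$ and a hypothetical predictor $f(x)=\tanh(w^{\top}x)$ whose derivatives are available in closed form. The function names `stein_terms` and `stein_signal` are illustrative, and the per-coordinate split $\partial_i^2 f + s_{p,i}\,\partial_i f$ is only one plausible reading of a per-dimension residual, not necessarily the paper's $r_{f,i}$.

```python
import numpy as np

def stein_terms(x, w):
    """Per-sample, per-coordinate terms d_i^2 f + s_{p,i} * d_i f.

    Assumptions (illustrative only): f(x) = tanh(w.x) and p = N(0, I),
    so the score is s_p(x) = -x. Summing over coordinates gives the
    Langevin Stein operator L_p f(x) = Laplacian(f) + s_p(x).grad(f).
    """
    u = x @ w                                    # (n,) pre-activation
    t = np.tanh(u)
    sech2 = 1.0 - t**2                           # tanh'(u)
    grad = sech2[:, None] * w                    # (n, d) gradient of f
    hess_diag = (-2.0 * t * sech2)[:, None] * w**2  # (n, d) Hessian diagonal
    score = -x                                   # score of N(0, I)
    return hess_diag + score * grad

def stein_signal(x, w):
    """Monte Carlo estimate of S_f(p, q) = E_q[L_p f(X)] from samples x ~ q."""
    return stein_terms(x, w).sum(axis=1).mean()

rng = np.random.default_rng(0)
n = 200_000
w = np.array([1.0, -1.0]) / np.sqrt(2.0)         # sensitive direction of f
v = np.array([1.0, 1.0]) / np.sqrt(2.0)          # direction orthogonal to w

x_in = rng.standard_normal((n, 2))               # samples from p
x_aligned = x_in + w                             # unit mean shift along w
x_orth = x_in + v                                # unit mean shift orthogonal to w

s_in = stein_signal(x_in, w)
s_aligned = stein_signal(x_aligned, w)
s_orth = stein_signal(x_orth, w)

# Projection form for the aligned shift: grad log(q/p) equals the mean
# shift w, so S_f(p, q) = -E_q[grad f(X).w].
sech2_aligned = 1.0 - np.tanh(x_aligned @ w) ** 2
s_proj = -np.mean(sech2_aligned * (w @ w))

print(f"in-distribution: {s_in:.4f}")
print(f"aligned shift:   {s_aligned:.4f} (projection form {s_proj:.4f})")
print(f"orthogonal:      {s_orth:.4f}")
```

In this toy setup the signal is close to zero for in-distribution samples (the Stein identity) and for the orthogonal shift, while the aligned shift produces a clearly nonzero value that agrees with the projection form up to Monte Carlo error.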

Abstract

Out-of-distribution detection methods are often either data-centric, detecting deviations from the training input distribution irrespective of their effect on a trained model, or model-centric, relying on classifier outputs without explicit reference to data geometry. We propose TASTE (Task-Aware STEin operators): a task-aware framework based on so-called Stein operators, which allows us to link distribution shift to the input sensitivity of the model. We show that the resulting operator admits a clear geometric interpretation as a projection of distribution shift onto the sensitivity field of the model, yielding theoretical guarantees. Beyond detecting the presence of a shift, the same construction enables its localisation through a coordinate-wise decomposition and, for image data, provides interpretable per-pixel diagnostics. Experiments on controlled Gaussian shifts, MNIST under geometric perturbations, and CIFAR-10 perturbed benchmarks demonstrate that the proposed method aligns closely with task degradation while outperforming established baselines.

Paper Structure (36 sections, 5 theorems, 80 equations, 11 figures, 6 tables, 1 algorithm)


Key Result

Proposition 4.1

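Under mild regularity (see the appendix on the projection identity), $S_f(p,q)=\mathbb{E}_q[\mathcal{L}_p f(X)]=-\mathbb{E}_q\left[\nabla f(X)^{\top}\nabla\log\frac{q(X)}{p(X)}\right]$.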
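The projection identity quoted in the TL;DR can be recovered by one integration by parts. The derivation below is a sketch under stated assumptions, not the authors' proof: it writes $s_q=\nabla\log q$ for the score of the test density (notation introduced here for illustration) and assumes that $q\,\nabla f$ decays fast enough for the boundary term to vanish.

```latex
\begin{align}
S_f(p,q)
  &= \mathbb{E}_q\!\left[\Delta f(X) + s_p(X)^{\top}\nabla f(X)\right], \\
\mathbb{E}_q\!\left[\Delta f(X)\right]
  &= \int q(x)\,\Delta f(x)\,dx
   = -\int \nabla q(x)^{\top}\nabla f(x)\,dx
   = -\mathbb{E}_q\!\left[s_q(X)^{\top}\nabla f(X)\right], \\
S_f(p,q)
  &= \mathbb{E}_q\!\left[\bigl(s_p(X)-s_q(X)\bigr)^{\top}\nabla f(X)\right]
   = -\mathbb{E}_q\!\left[\nabla f(X)^{\top}\nabla\log\frac{q(X)}{p(X)}\right].
\end{align}
```

Setting $q=p$ makes the right-hand side vanish, which is the Stein identity $\mathbb{E}_p[\mathcal{L}_p f(X)]=0$.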

Figures (11)

  • Figure 1: Task-aware intuition behind TASTE. Blue contours depict the score-shift field between the training density $p(x)$ and test density $q(x)$. TASTE measures, without explicit knowledge of $q$, how distribution shift aligns with the input sensitivity of the model, $\nabla f(x)$: shifts aligned with $\nabla f$ (red) produce a large response, while orthogonal shifts (green) are largely ignored even if they are large in density terms.
  • Figure 2: Directional shift experiment. Prediction error and the task-aware Stein signal vary strongly with the direction of shift, peaking (in magnitude) when the shift aligns with the sensitive direction $(1,-1)$. In contrast, the density-based score remains nearly constant for all directions.
  • Figure 3: MNIST under geometric perturbations. Top: classification accuracy. Bottom: corresponding Stein signal. Translations (left) leave both accuracy and Stein score unchanged, while rotations (right) induce a monotonic increase in the Stein signal aligned with performance degradation.
  • Figure 4: Demonstration of per-pixel anomaly heatmaps for the MNIST-based task. Input digits are shown on the left and the corresponding heatmaps on the right. Note that neither the original nor the translated digits generate a significant anomaly signal.
  • Figure 5: Additional qualitative results. Each panel shows a randomly chosen example from one of the MVTec AD categories. Each panel contains four subplots: the original image (with corruption), the detector's heatmap, the prediction overlay at $\alpha = 0.01$, and the prediction and ground truth (GT) overlay.
  • ...and 6 more figures

Theorems & Definitions (11)

  • Proposition 4.1: Projection identity
  • Definition 1.1: Stein class for $\mathcal{L}_p$
  • Proposition 1.2: Stein identity
  • proof
  • proof: Proof of Proposition 4.1 (projection identity)
  • Proposition 1.4
  • proof
  • Proposition 1.5: Directional decomposition
  • proof
  • Proposition 1.6: Fisher-controlled stability to score-model error
  • ...and 1 more