TASTE: Task-Aware Out-of-Distribution Detection via Stein Operators
Michał Kozyra, Gesine Reinert
TL;DR
TASTE is a task-aware OOD diagnostic that fixes a pretrained predictor $f_\theta$ and couples it with a score-based model $p$ of the training distribution through the Langevin Stein operator $\mathcal{L}_p f(x)=\Delta f(x)+s_p(x)^{\top}\nabla f(x)$, where $s_p=\nabla\log p$ is the score of $p$. The core quantity $S_f(p,q)=\mathbb{E}_q[\mathcal{L}_p f(X)]$ admits the geometric form $S_f(p,q)=-\mathbb{E}_q[\nabla f(X)^{\top}\nabla\log\frac{q(X)}{p(X)}]$, a projection of the shift between $q$ and $p$ onto the model’s input sensitivity, so the signal is task-aware: it measures the shift components that align with that sensitivity. A bias-corrected variant $\tilde{S}_f(p,q)$ compensates for approximate score models, and per-sample and per-dimension residuals $r_f(x)$ and $r_{f,i}(x)$ give interpretable diagnostics, including per-pixel anomaly maps for images. Experiments on controlled 2D shifts, MNIST rotations, CIFAR-10 benchmarks, and MVTec AD indicate that TASTE tracks task degradation and offers competitive OOD detection with meaningful localization, without retraining and without negative samples. The approach thus links generative data geometry with discriminative model sensitivity, supporting post-deployment monitoring and auditing of deep systems.
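The step from the operator form to the projection form is integration by parts. A brief sketch, under assumptions not stated above: $q$ is a smooth density, $q\,\nabla f$ vanishes at infinity so that boundary terms drop out, and $s_p=\nabla\log p$ holds exactly.

$$
\mathbb{E}_q[\Delta f(X)]=\int q(x)\,\Delta f(x)\,dx=-\int \nabla q(x)^{\top}\nabla f(x)\,dx=-\mathbb{E}_q\big[\nabla f(X)^{\top}\nabla\log q(X)\big],
$$

$$
S_f(p,q)=\mathbb{E}_q[\Delta f(X)]+\mathbb{E}_q\big[\nabla f(X)^{\top}\nabla\log p(X)\big]=-\mathbb{E}_q\Big[\nabla f(X)^{\top}\nabla\log\frac{q(X)}{p(X)}\Big].
$$

In particular, $S_f(p,p)=0$ (the Stein identity), and any component of $\nabla\log\frac{q}{p}$ orthogonal to $\nabla f$ contributes nothing to $S_f(p,q)$.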
Abstract
Out-of-distribution detection methods are often either data-centric, detecting deviations from the training input distribution irrespective of their effect on a trained model, or model-centric, relying on classifier outputs without explicit reference to data geometry. We propose TASTE (Task-Aware STEin operators), a task-aware framework based on Stein operators that links distribution shift to the input sensitivity of the model. We show that the resulting operator admits a clear geometric interpretation as a projection of distribution shift onto the sensitivity field of the model, yielding theoretical guarantees. Beyond detecting the presence of a shift, the same construction enables its localisation through a coordinate-wise decomposition and, for image data, provides interpretable per-pixel diagnostics. Experiments on controlled Gaussian shifts, MNIST under geometric perturbations, and perturbed CIFAR-10 benchmarks demonstrate that the proposed method aligns closely with task degradation while outperforming established baselines.
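The projection interpretation can be checked numerically on a toy Gaussian shift. The sketch below is illustrative only and is not the authors' implementation: it assumes a hypothetical predictor $f(x)=\tanh(w^{\top}x)$, a training distribution $p=\mathcal{N}(0,I)$ with exact score $s_p(x)=-x$, and a shifted distribution $q=\mathcal{N}(\mu,I)$, for which $\nabla\log\frac{q}{p}=\mu$. The coordinate-wise split $\partial_i^2 f+s_{p,i}\,\partial_i f$ is one natural reading of a coordinate-wise decomposition and may differ from the paper's residual $r_{f,i}$.

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.array([1.0, 0.0])  # hypothetical sensitivity direction of the toy predictor


def f_grad(x):
    """Gradient of f(x) = tanh(w.x), shape (n, d)."""
    t = np.tanh(x @ w)
    return (1.0 - t**2)[:, None] * w


def f_hess_diag(x):
    """Diagonal of the Hessian of f, shape (n, d); row sums give the Laplacian."""
    t = np.tanh(x @ w)
    return (-2.0 * t * (1.0 - t**2))[:, None] * w**2


def score_p(x):
    """Exact score of p = N(0, I): grad log p(x) = -x (assumed known here)."""
    return -x


def stein_terms(x):
    """Per-sample, per-coordinate terms d_i^2 f + s_{p,i} d_i f, shape (n, d)."""
    return f_hess_diag(x) + score_p(x) * f_grad(x)


def estimate(mu, n=200_000):
    """Monte Carlo estimates under q = N(mu, I)."""
    x = rng.normal(size=(n, 2)) + mu
    per_coord = stein_terms(x).mean(axis=0)     # coordinate-wise contributions
    operator_form = per_coord.sum()             # estimate of E_q[L_p f(X)]
    projection_form = -(f_grad(x) @ mu).mean()  # -E_q[grad f . mu], since grad log(q/p) = mu
    return operator_form, projection_form, per_coord


for name, mu in [("no shift", np.zeros(2)),
                 ("shift along w", np.array([1.0, 0.0])),
                 ("shift orthogonal to w", np.array([0.0, 1.5]))]:
    op, proj, per_coord = estimate(mu)
    print(f"{name}: operator={op:+.3f} projection={proj:+.3f} per-coordinate={per_coord}")
```

In this toy setting the two estimates agree up to Monte Carlo error, and a shift orthogonal to $w$ leaves the statistic near zero even though $q\neq p$, which is the sense in which the diagnostic is task-aware rather than purely data-centric.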
