A View on Out-of-Distribution Identification from a Statistical Testing Theory Perspective

Alberto Caron; Chris Hicks; Vasilios Mavroudis

TL;DR

The paper reframes out-of-distribution detection as a non-parametric statistical testing problem and establishes identifiability conditions that determine when OOD can be reliably detected. It introduces a Wasserstein distance-based test statistic and proves both asymptotic uniform consistency under a separation condition and non-asymptotic power bounds, clarifying the limits of detectability. The analysis argues for the advantages of distributional-distance tests over KL/JS-based methods, especially in high-dimensional and non-overlapping regimes. Two experiments, a synthetic generative-model task and an MNIST versus Fashion-MNIST setup, demonstrate the practical effectiveness of the Wasserstein OOD test for detecting distributional shifts at test time.

Abstract

We study the problem of efficiently detecting Out-of-Distribution (OOD) samples at test time in supervised and unsupervised learning contexts. ML models are typically trained under the assumption that training and test data stem from the same distribution, but this often fails in realistic settings, so reliably detecting distribution shifts at deployment is crucial. We reformulate the OOD problem through the lens of statistical testing and then discuss conditions that render it identifiable in statistical terms. Building on this framework, we study convergence guarantees of an OOD test based on the Wasserstein distance and provide a simple empirical evaluation.

Paper Structure (13 sections, 9 theorems, 32 equations, 3 figures)

Introduction
Problem Framework
A Wasserstein Distance OOD Test
OOD Identifiability and Test Power
Distributional Distance Tests
Experiments
Generative Model Example
MNIST Classification
Conclusion
Proofs of Theorems
Additional Information on Distances
Link between KL and Wasserstein Distance
Simplifications for Gaussian Densities

Key Result

Theorem 3.1

Let $\mathcal{D}_m$ be a test dataset. The test based on $T^{wass}_m = m^{1/2} W_p (P_{\theta}, Q)$ for hypotheses $H_0: \mathcal{D}_m \sim P_{\theta}$ vs $H_1: \mathcal{D}_m \sim Q \neq P_{\theta}$ is uniformly consistent as $m \rightarrow \infty$ over alternatives $Q_m$ that satisfy $m^{1/2} W_p(P_{\theta}, Q_m) \geq \Delta_m$, where $\lim_{m \rightarrow \infty} \Delta_m = \infty$.
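As a rough illustration of how a statistic of the form $m^{1/2} W_p$ might be computed from samples, the sketch below uses a plug-in estimate on one-dimensional data with $p = 1$. Several choices here are assumptions and are not taken from the paper: the Gaussian toy data, the restriction to one dimension, the use of SciPy's `wasserstein_distance`, and the permutation procedure used to calibrate a p-value. The helper names `wasserstein_statistic` and `permutation_p_value` are hypothetical.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def wasserstein_statistic(reference, test):
    """Plug-in estimate of sqrt(m) * W_1 between two 1-D samples (m = test size)."""
    m = len(test)
    return np.sqrt(m) * wasserstein_distance(reference, test)


def permutation_p_value(reference, test, n_perm=500, seed=0):
    """Permutation p-value for H0: both samples come from the same distribution.

    Assumption: permutation calibration is an illustrative choice,
    not the calibration analysed in the paper.
    """
    rng = np.random.default_rng(seed)
    observed = wasserstein_statistic(reference, test)
    pooled = np.concatenate([reference, test])
    m = len(test)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        # Re-split the pooled data into a pseudo-test set and a pseudo-reference set.
        if wasserstein_statistic(perm[m:], perm[:m]) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


rng = np.random.default_rng(1)
reference = rng.normal(0.0, 1.0, size=2000)  # stand-in for draws from P_theta
in_dist = rng.normal(0.0, 1.0, size=200)     # test set generated under H0
shifted = rng.normal(1.0, 1.0, size=200)     # test set with a mean shift of 1

stat_id = wasserstein_statistic(reference, in_dist)
stat_ood = wasserstein_statistic(reference, shifted)
p_id = permutation_p_value(reference, in_dist)
p_ood = permutation_p_value(reference, shifted)
print(f"in-distribution: T = {stat_id:.2f}, p = {p_id:.3f}")
print(f"shifted:         T = {stat_ood:.2f}, p = {p_ood:.3f}")
```

In this toy setting the shifted test set yields a much larger statistic than the in-distribution one, which matches the intuition behind the separation condition: alternatives that are far from $P_{\theta}$ in Wasserstein distance, relative to $m^{-1/2}$, are the ones the test can be expected to reject.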

Figures (3)

Figure 1: Examples of discrete distribution shifts where the KL and JS divergences offer a less informative measure, while $W(P,Q)$ captures that the shift on the right is geometrically much further from the reference distribution than the one on the left.
Figure 2: The plot on the left depicts the true distribution of the latent factors $(Z_1, Z_2)$, the distribution learnt via FL, and the OOD one. The plot on the right reports the mean AUROC of each OOD test (with 90% error bands) as a function of the number of standard deviations from the ID mean.
Figure 3: The plot on the left shows samples from the MNIST (ID) and Fashion MNIST (OOD) datasets. The centre plot shows the distribution of the two principal latent factors, computed with Truncated SVD, on MNIST and Fashion MNIST. The table on the right reports the AUROC, TPR and FPR of the four OOD tests considered.
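The contrast described in the Figure 1 caption can be reproduced on a small toy example. The sketch below is not the paper's figure: the support, the three probability vectors and the variable names are all assumptions chosen for illustration. It compares a reference distribution with a nearby shift and a distant shift, both of which have no overlap with the reference.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import entropy, wasserstein_distance

support = np.arange(11)  # discrete support {0, 1, ..., 10}


def two_point_mass(a, b):
    """Probability vector on `support` with mass 0.5 at positions a and b."""
    p = np.zeros(len(support))
    p[[a, b]] = 0.5
    return p


p_ref = two_point_mass(0, 1)    # reference distribution
q_near = two_point_mass(2, 3)   # small geometric shift, disjoint support
q_far = two_point_mass(9, 10)   # large geometric shift, disjoint support

results = {}
for name, q in [("near", q_near), ("far", q_far)]:
    kl = entropy(p_ref, q)               # KL(P || Q), infinite on disjoint supports
    js = jensenshannon(p_ref, q) ** 2    # SciPy returns the square root of the JS divergence
    w1 = wasserstein_distance(support, support, p_ref, q)
    results[name] = (kl, js, w1)
    print(f"{name}: KL = {kl}, JS = {js:.4f}, W1 = {w1:.1f}")
```

Because the supports are disjoint, the KL divergence is infinite and the JS divergence saturates at $\ln 2$ for both shifts, so neither distinguishes them. The Wasserstein distance, which depends on how far mass has to move, is larger for the distant shift.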

Theorems & Definitions (14)

Remark 2.2: OOD Test
Theorem 3.1: Uniform Consistency
Theorem 3.2: Non-Asymptotic Lower Bound
Theorem 3.3: Worst Case Upper Bound
Theorem 3.4: Intermediate Case Asymptotic Upper Bound
Theorem A.1: Restatement of Theorem 3.1
proof
Theorem A.2: Restatement of Theorem 3.2
proof
Theorem A.3: Bolley et al. (2007)
...and 4 more
