Quantifying Statistical Significance of Deep Nearest Neighbor Anomaly Detection via Selective Inference

Mizuki Niihori, Shuichi Nishino, Teruyuki Katsuoka, Tomohiro Shiraishi, Kouichi Taji, Ichiro Takeuchi

TL;DR

This work tackles uncertainty quantification for deep kNN-based anomaly detection in a semi-supervised setting by casting anomaly signaling as a statistical test and applying Selective Inference to obtain valid selective p-values. The method accounts for selection bias from both kNN neighborhood choice and deep feature computations, reducing the problem to a tractable truncated chi-square computation with conditioning. Empirical results on synthetic, tabular, and image data (including MVTec AD) show controlled false positive rates at α = 0.05 and superior power compared to baselines, with an open-source implementation released for reproducibility. The approach provides a practical framework for reliable anomaly detection in industrial and safety-critical contexts by delivering principled significance measures alongside detection scores.

Abstract

In real-world applications, anomaly detection (AD) often operates without access to anomalous data, necessitating semi-supervised methods that rely solely on normal data. Among these methods, deep k-nearest neighbor (deep kNN) AD stands out for its interpretability and flexibility, leveraging distance-based scoring in deep latent spaces. Despite its strong performance, deep kNN lacks a mechanism to quantify uncertainty, an essential feature for critical applications such as industrial inspection. To address this limitation, we propose a statistical framework that quantifies the significance of detected anomalies in the form of p-values, thereby enabling control over false positive rates at a user-specified significance level (e.g., 0.05). A central challenge lies in managing selection bias, which we tackle using Selective Inference, a principled method for conducting inference conditioned on data-driven selections. We evaluate our method on diverse datasets and demonstrate that it provides reliable AD well-suited for industrial use cases.
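
To make the "distance-based scoring in deep latent spaces" idea concrete, here is a minimal sketch of a kNN anomaly score. It is an illustration under assumptions, not the paper's implementation: the feature extractor is omitted (inputs are taken to be latent vectors already), the score is assumed to be the mean distance to the k nearest normal samples, and the name `knn_anomaly_score` is hypothetical.

```python
import math


def knn_anomaly_score(normal_feats, query_feat, k=3):
    """Mean Euclidean distance from query_feat to its k nearest normal features.

    normal_feats: list of latent vectors computed from normal data only.
    query_feat:   latent vector of the test sample.
    The mean-of-k-distances score is an assumption made for illustration.
    """
    if k < 1 or k > len(normal_feats):
        raise ValueError("k must be between 1 and the number of normal samples")
    # Distance from the query to every normal sample in latent space.
    dists = sorted(math.dist(f, query_feat) for f in normal_feats)
    # Average over the k smallest distances.
    return sum(dists[:k]) / k


# Four normal points on the unit square (stand-ins for latent features).
normal = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
score_inlier = knn_anomaly_score(normal, (0.0, 0.0), k=2)
score_outlier = knn_anomaly_score(normal, (10.0, 10.0), k=2)
```

A query far from the normal data gets a larger score. Because the neighbors are chosen from the same data used to compute the score, thresholding this score directly does not yield a valid p-value, which is the selection bias the abstract refers to.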

Paper Structure

This paper contains 32 sections, 2 theorems, 32 equations, 40 figures.

Key Result

Theorem 4.1

The following conditional test statistic follows a truncated $\chi$ distribution with $(1+n)d$ degrees of freedom, where the truncation is determined by the constraint "$\mathcal{E}_{\bm{Y}} = \mathcal{E}_{\bm{y}}$" and the domain of the distribution is on the one-dimensional subspace defined by $\{ \bm{Y} \mid \mathcal{Q}_{\bm{Y}} = \math
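
As a rough illustration of how a p-value can be read off a truncated $\chi$ distribution, the sketch below integrates the $\chi$ density numerically over a truncation region. It rests on several assumptions: the truncation region is supplied as a list of intervals (deriving it from the kNN and deep-feature selection event is the substance of the paper and is not shown), the p-value is taken to be the upper-tail mass within that region, and the names `chi_pdf`, `chi_mass`, and `selective_p_value` are hypothetical. Here `df` plays the role of the $(1+n)d$ degrees of freedom.

```python
import math


def chi_pdf(x, df):
    """Density of the chi distribution with df degrees of freedom."""
    if x <= 0.0:
        return 0.0
    log_pdf = ((df - 1) * math.log(x) - x * x / 2.0
               - (df / 2.0 - 1.0) * math.log(2.0) - math.lgamma(df / 2.0))
    return math.exp(log_pdf)


def chi_mass(a, b, df, steps=2000):
    """Midpoint-rule approximation of the chi probability mass on [a, b]."""
    if b <= a:
        return 0.0
    h = (b - a) / steps
    return h * sum(chi_pdf(a + (i + 0.5) * h, df) for i in range(steps))


def selective_p_value(observed, intervals, df):
    """Upper-tail probability of a chi variable truncated to `intervals`.

    intervals: list of (low, high) pairs; high may be math.inf.
    """
    # Mass beyond sqrt(df) + 12 is negligible, so cap infinite endpoints there.
    cap = math.sqrt(df) + 12.0
    total = 0.0
    tail = 0.0
    for low, high in intervals:
        low, high = max(low, 0.0), min(high, cap)
        total += chi_mass(low, high, df)
        tail += chi_mass(max(low, observed), high, df)
    if total == 0.0:
        raise ValueError("truncation region has no probability mass")
    return tail / total


p_untruncated = selective_p_value(1.0, [(0.0, math.inf)], df=2)
p_truncated = selective_p_value(1.0, [(0.5, 2.0)], df=2)
```

With `df=2` the $\chi$ distribution is the Rayleigh distribution, whose upper tail is $e^{-x^2/2}$, so the untruncated case can be checked in closed form. Conditioning on the interval changes the p-value for the same observed statistic, which is why ignoring the selection event gives a different answer.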

Figures (40)

  • Figure 1: Examples of anomaly patches extracted from Capsule images using $k$NN-based AD (see the experiments section for detailed settings). For both the normal image (left) and the anomaly image (right), two types of $p$-values derived from different statistical tests are presented. The "naive $p$" is the $p$-value obtained using a conventional method, while the "selective $p$" is the $p$-value computed using the method proposed in this study. At a significance level of $\alpha = 0.05$, the conventional naive $p$-value for the normal image (left) falls below the threshold, resulting in a false positive detection. In contrast, the proposed selective $p$-value correctly identifies it as a true negative. For the anomaly image (right), both $p$-values fall below the threshold, correctly identifying the patch as anomalous (true positive). In this study, we show that conventional naive $p$-values are invalid as measures of statistical significance, whereas the proposed selective $p$-values serve as valid uncertainty measures for assessing the significance of anomalies detected by $k$NN-based ADs.
  • Figure 2: Parametric
  • Figure 3: Semi-Parametric
  • Figure 5: Parametric
  • Figure 6: Semi-Parametric
  • ...and 35 more figures

Theorems & Definitions (2)

  • Theorem 4.1
  • Theorem 4.2