Statistical Test for Anomaly Detections by Variational Auto-Encoders
Daiki Miwa, Tomohiro Shiraishi, Vo Nguyen Le Duy, Teruyuki Katsuoka, Ichiro Takeuchi
TL;DR
The paper addresses the reliability of VAE-based anomaly detection in high-stakes settings by introducing the VAE-AD Test, a selective-inference–based statistical test that outputs p-values for detected anomalous regions. It treats the anomaly region as a data-driven hypothesis and derives p-values from a truncated normal distribution conditional on the region selection, leveraging the VAE’s piecewise-linear structure. The method provides finite-sample validity and demonstrates improved Type I error control and higher power relative to baselines in synthetic and brain-imaging experiments. This approach enhances trust in deep learning–based anomaly localization by quantifying statistical reliability and enabling controlled decisions in medical imaging contexts.
Abstract
In this study, we consider the reliability assessment of anomaly detection (AD) using a Variational Autoencoder (VAE). Over the last decade, VAE-based AD has been actively studied from various perspectives, from method development to applied research. However, when the results of AD are used in high-stakes decision-making, such as medical diagnosis, it is necessary to ensure the reliability of the detected anomalies. In this study, we propose the VAE-AD Test as a method for quantifying the statistical reliability of VAE-based AD within the framework of statistical testing. Using the VAE-AD Test, the reliability of the anomaly regions detected by a VAE can be quantified in the form of p-values. This means that if an anomaly is declared only when the p-value is below a chosen threshold, the probability of false detection can be controlled at the desired level. Since the VAE-AD Test is constructed based on a new statistical inference framework called selective inference, its validity is theoretically guaranteed in finite samples. To demonstrate the validity and effectiveness of the proposed VAE-AD Test, we conduct numerical experiments on artificial data and apply the method to brain image analysis.
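To make the idea of a p-value computed from a truncated normal distribution concrete, the following is a minimal sketch, not the authors' implementation. It assumes a test statistic that is N(0, sigma^2) under the null hypothesis and a selection event that restricts the statistic to a single interval [lo, hi]; both are simplifying assumptions, and the actual truncation region may be more complex. The function names are hypothetical, and the plain CDF arithmetic can lose precision far in the tails.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def truncated_normal_cdf(z: float, lo: float, hi: float, sigma: float = 1.0) -> float:
    """CDF at z of N(0, sigma^2) truncated to the interval [lo, hi]."""
    if not lo <= z <= hi:
        raise ValueError("z must lie inside the truncation interval")
    a = norm_cdf(lo / sigma)
    b = norm_cdf(hi / sigma)
    return (norm_cdf(z / sigma) - a) / (b - a)


def selective_p_value(z: float, lo: float, hi: float, sigma: float = 1.0) -> float:
    """Two-sided p-value conditional on the (assumed) selection interval."""
    f = truncated_normal_cdf(z, lo, hi, sigma)
    return 2.0 * min(f, 1.0 - f)


# Naive p-value: ignores selection (no truncation).
p_naive = selective_p_value(1.96, -math.inf, math.inf)

# Selective p-value: assumes the region was selected only because z >= 1.5.
p_selective = selective_p_value(1.96, 1.5, math.inf)

print(f"naive: {p_naive:.3f}, selective: {p_selective:.3f}")
```

In this toy setting the same observed statistic yields a much larger p-value once the selection event is taken into account, which is the kind of correction that keeps the false detection probability at the nominal level.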
