Model Selection of Anomaly Detectors in the Absence of Labeled Validation Data

Clement Fung; Chen Qiu; Aodong Li; Maja Rudolph

TL;DR

This work addresses how to select anomaly detectors when no labeled validation data are available. It introduces SWSA, a general framework that builds synthetic validation sets from a small support set of normal images using two training-free anomaly-generation methods: CutPaste and a diffusion-based method built on DiffStyle. By evaluating candidate detectors and CLIP prompts on these synthetic sets, SWSA makes model and prompt selections that often match those made with ground-truth validation data. It outperforms baseline selection strategies in several natural-domain tasks and enables zero-shot CLIP-based anomaly detection without labeled validation data. The empirical study spans four datasets and 329 tasks. It finds that diffusion-based synthetic anomalies are particularly effective for ranking models in natural image domains, while CutPaste can be advantageous for fine-grained industrial defects. The paper also analyzes theoretical bounds based on total variation and finds no tight guarantees in general. Overall, SWSA offers a scalable, training-free way to deploy and adapt anomaly detectors to new domains where labeled validation data are scarce or unavailable, which supports rapid model and prompt selection in practice.
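
To make the CutPaste idea concrete, the following is a minimal sketch of a cut-and-paste augmentation: a patch is copied from one location of a normal image and pasted at a different location, which creates a local irregularity. It is an illustration under simplifying assumptions, not the paper's implementation: images are plain 2D lists of pixel values, the patch size is fixed by the caller, and the patch is neither rotated nor color-jittered. The function name `cutpaste` and its parameters are hypothetical.

```python
import random


def cutpaste(image, patch_h, patch_w, rng):
    """Copy a random patch of `image` and paste it at a different location.

    `image` is a 2D list (rows of equal length). A new image is returned
    and the input is left unmodified. Simplified: no rotation or jitter.
    """
    height, width = len(image), len(image[0])
    n_rows = height - patch_h + 1  # number of valid top-left row positions
    n_cols = width - patch_w + 1   # number of valid top-left column positions
    if patch_h < 1 or patch_w < 1 or n_rows < 1 or n_cols < 1:
        raise ValueError("patch must fit inside the image")
    if n_rows * n_cols < 2:
        raise ValueError("patch must be smaller than the image")

    # Pick where the patch is cut from.
    src = (rng.randrange(n_rows), rng.randrange(n_cols))
    # Pick where it is pasted, resampling until it differs from the source.
    dst = src
    while dst == src:
        dst = (rng.randrange(n_rows), rng.randrange(n_cols))

    out = [list(row) for row in image]
    for dr in range(patch_h):
        for dc in range(patch_w):
            # Always read from the original image so overlapping source and
            # destination regions do not corrupt the copied patch.
            out[dst[0] + dr][dst[1] + dc] = image[src[0] + dr][src[1] + dc]
    return out


# Toy usage: an 8x8 "image" whose pixel values are all distinct.
toy_image = [[r * 8 + c for c in range(8)] for r in range(8)]
toy_anomaly = cutpaste(toy_image, 3, 3, random.Random(0))
```

Applying such a function to each image in the normal support set yields one synthetic anomaly per normal image without any training.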

Abstract

Anomaly detection is the task of identifying abnormal samples in large unlabeled datasets. While the advent of foundation models has produced powerful zero-shot anomaly detection methods, their deployment in practice is often hindered by the absence of labeled validation data -- without it, their detection performance cannot be evaluated reliably. In this work, we propose SWSA (Selection With Synthetic Anomalies): a general-purpose framework to select image-based anomaly detectors without labeled validation data. Instead of collecting labeled validation data, we generate synthetic anomalies without any training or fine-tuning, using only a small support set of normal images. Our synthetic anomalies are used to create detection tasks that compose a validation framework for model selection. In an empirical study, we evaluate SWSA with three types of synthetic anomalies and on two selection tasks: model selection of image-based anomaly detectors and prompt selection for CLIP-based anomaly detection. SWSA often selects models and prompts that match selections made with a ground-truth validation set, outperforming baseline selection strategies.
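
The selection procedure described in the abstract can be sketched as follows: each candidate detector scores the real normal images and the synthetic anomalies, the scores are summarized as an AUROC, and the candidate with the highest synthetic-validation AUROC is selected. This is a minimal sketch under stated assumptions, not the authors' code: a detector is modeled as any callable that maps an image to an anomaly score (higher means more anomalous), and the names `auroc`, `select_detector`, and the toy detectors are hypothetical.

```python
def auroc(normal_scores, anomaly_scores):
    """AUROC as the probability that a random anomaly outscores a random
    normal sample, with ties counted as one half."""
    if not normal_scores or not anomaly_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for a in anomaly_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomaly_scores) * len(normal_scores))


def select_detector(detectors, normal_images, synthetic_anomalies):
    """Evaluate every candidate on the synthetic validation set.

    `detectors` maps a name to a scoring callable. Returns the name of the
    best candidate and a dict of AUROC values for all candidates.
    """
    results = {}
    for name, score_fn in detectors.items():
        normal_scores = [score_fn(x) for x in normal_images]
        anomaly_scores = [score_fn(x) for x in synthetic_anomalies]
        results[name] = auroc(normal_scores, anomaly_scores)
    best = max(results, key=results.get)
    return best, results


# Toy usage with made-up data: "images" are short lists of pixel values.
normal_images = [[0.1, 0.2], [0.0, 0.3]]
synthetic_anomalies = [[0.1, 5.0], [4.0, 0.2]]
candidates = {
    "max_pixel": max,                 # scores an image by its largest pixel
    "first_pixel": lambda x: x[0],    # scores an image by its first pixel
}
best_name, auroc_by_name = select_detector(
    candidates, normal_images, synthetic_anomalies
)
```

The same loop applies to prompt selection if each entry in `detectors` is a CLIP-based scorer built from a different candidate prompt.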

Paper Structure (29 sections, 4 equations, 4 figures, 3 tables)

Introduction
Related Work
Unsupervised anomaly detection.
Anomaly detection with foundation models.
Meta-evaluation of anomaly detection.
Guided image synthesis.
Method
Model Selection with Synthetic Anomalies
Generating Synthetic Anomalies
CutPaste.
Diffusion-based generation.
Empirical Study
Experimental Setup
Datasets.
Anomaly detection tasks.
...and 14 more sections

Figures (4)

Figure 1: We propose two methods for generating synthetic anomalies, as described in \ref{sec:anomaly-generation}: image-guided generation with a diffusion model and local augmentation. We produce a synthetic validation set by combining real normal images with synthetic anomalies. The synthetic validation set is then used for model selection, as described in \ref{sec:anomaly-use}. Components in blue are frozen, components in green are real data, and components in orange are methods implemented in this work.
Figure 2: Examples of synthetic anomalies generated with our diffusion-based method for CUB class 1 (left) and MVTec-AD "cable" (right). For each example, the images in the top row (in green) are used as source "style" images, and the images in the left column (in cyan) are used as source "content" images. The inner grid (in red) shows each pairwise interpolation between a source style image and a source content image, performed with our modified DiffStyle process. All source images are drawn from the distribution of class 1 support images; no validation data or images from other classes are used.
Figure 3: To evaluate SWSA for model ranking, we compare the real and synthetic validation AUROC for all models when using our three candidate synthetic validation sets (Tiny-ImageNet, our diffusion-based anomalies, and our CutPaste-based anomalies). SWSA performs best when ranking models with diffusion-based anomalies in the one-vs-rest anomaly detection setting on datasets with natural variation (i.e., CUB and Flowers). For a quantitative evaluation, we provide Kendall's Tau rank correlation values in \ref{table:kendallrank}.
Figure 4: For each anomaly detection task, the ViT-B-16 embeddings of in-class images (orange triangle), diffusion-generated anomalies (black circle), CutPaste-generated anomalies (green square), and real one-vs-closest anomalies (blue triangle) are shown. When anomalies come from natural variations between classes (CUB and Flowers), they are better represented by diffusion-based anomalies. When anomalous images come from local changes, they are better represented by CutPaste-based anomalies.
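
The Figure 3 caption refers to Kendall's Tau as the measure of agreement between model rankings under real and synthetic validation AUROC. As a rough illustration of what that statistic computes, the sketch below implements the simplest variant (tau-a, with no tie correction) in plain Python. The function name `kendall_tau` is hypothetical, the AUROC values in the usage example are made up for illustration and are not results from the paper, and the paper may use a different variant; library routines such as `scipy.stats.kendalltau` apply a tie-corrected version by default.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a between two paired score lists.

    Counts pairs of items ordered the same way in both lists (concordant)
    and pairs ordered oppositely (discordant). Tied pairs count as neither.
    """
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    if len(x) < 2:
        raise ValueError("need at least two items to compare")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Made-up AUROC values for four hypothetical candidate models.
real_auroc = [0.90, 0.80, 0.70, 0.60]
synthetic_auroc = [0.85, 0.90, 0.60, 0.65]
tau = kendall_tau(real_auroc, synthetic_auroc)  # 4 concordant, 2 discordant
```

A value near 1 means the synthetic validation set ranks the candidates almost exactly as the real validation set does, while a value near 0 means the two rankings are essentially unrelated.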
