Model Selection of Anomaly Detectors in the Absence of Labeled Validation Data
Clement Fung, Chen Qiu, Aodong Li, Maja Rudolph
TL;DR
This work addresses how to select anomaly detectors when no labeled validation data are available. It introduces SWSA, a general framework that builds synthetic validation sets from a small support set of normal images using two training-free anomaly-generation methods: CutPaste and the diffusion-based DiffStyle. Candidate detectors and CLIP prompts are evaluated on these synthetic sets, and the resulting selections often align with those made using ground-truth validation data. SWSA outperforms baselines in several natural-domain tasks and enables zero-shot CLIP-based anomaly detection without labeled validation data. The empirical study spans four datasets and 329 tasks. It finds that diffusion-based synthetic anomalies are particularly effective for ranking models in natural image domains, while CutPaste can be advantageous for fine-grained industrial defects. A theoretical analysis based on total variation yields no tight guarantees in general. Overall, SWSA offers a scalable, training-free way to deploy and adapt anomaly detectors to new domains where labeled validation data are scarce or unavailable, with practical implications for rapid model and prompt selection in real-world settings.
Abstract
Anomaly detection is the task of identifying abnormal samples in large unlabeled datasets. While the advent of foundation models has produced powerful zero-shot anomaly detection methods, their deployment in practice is often hindered by the absence of labeled validation data: without such data, their detection performance cannot be evaluated reliably. In this work, we propose SWSA (Selection With Synthetic Anomalies): a general-purpose framework to select image-based anomaly detectors without labeled validation data. Instead of collecting labeled validation data, we generate synthetic anomalies without any training or fine-tuning, using only a small support set of normal images. Our synthetic anomalies are used to create detection tasks that compose a validation framework for model selection. In an empirical study, we evaluate SWSA with three types of synthetic anomalies and on two selection tasks: model selection of image-based anomaly detectors and prompt selection for CLIP-based anomaly detection. SWSA often selects models and prompts that match selections made with a ground-truth validation set, outperforming baseline selection strategies.
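
The selection procedure described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cut_paste`, `auroc`, `select_detector`) are hypothetical, the use of AUROC as the ranking metric is an assumption, and `cut_paste` is a simplified stand-in for the CutPaste augmentation (it only copies one rectangular patch to a new location). Detectors are assumed to be plain functions that map an image array to an anomaly score, with higher meaning more anomalous.

```python
import numpy as np


def cut_paste(image, rng, patch_frac=0.25):
    """Simplified CutPaste-style anomaly: copy one random patch of the
    image to another random location. Returns a new array."""
    h, w = image.shape[:2]
    ph, pw = max(1, int(h * patch_frac)), max(1, int(w * patch_frac))
    # Source and destination top-left corners (upper bound is exclusive).
    sy, sx = rng.integers(0, h - ph + 1), rng.integers(0, w - pw + 1)
    dy, dx = rng.integers(0, h - ph + 1), rng.integers(0, w - pw + 1)
    out = image.copy()
    out[dy:dy + ph, dx:dx + pw] = image[sy:sy + ph, sx:sx + pw]
    return out


def auroc(scores_normal, scores_anomalous):
    """Probability that a random anomaly scores higher than a random
    normal sample; ties count as 0.5."""
    n = np.asarray(scores_normal, dtype=float)[:, None]
    a = np.asarray(scores_anomalous, dtype=float)[None, :]
    return float(np.mean((a > n) + 0.5 * (a == n)))


def select_detector(detectors, support, make_anomaly=cut_paste, seed=0):
    """Rank candidate detectors on a synthetic validation set.

    detectors: dict mapping a name to a scoring function image -> float.
    support:   list of normal images (the small support set).
    Returns the best-scoring detector name and the AUROC of each candidate.
    """
    rng = np.random.default_rng(seed)
    # Build the synthetic validation set: one synthetic anomaly per normal image.
    synthetic = [make_anomaly(x, rng) for x in support]
    results = {}
    for name, score_fn in detectors.items():
        s_norm = [score_fn(x) for x in support]
        s_anom = [score_fn(x) for x in synthetic]
        results[name] = auroc(s_norm, s_anom)
    best = max(results, key=results.get)
    return best, results
```

The same loop would apply to prompt selection by treating each candidate prompt as one entry in `detectors`. Swapping `make_anomaly` for a different generator changes the synthetic validation set without touching the ranking logic.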
