Unseen Visual Anomaly Generation

Han Sun, Yunkang Cao, Hao Dong, Olga Fink

TL;DR

This paper tackles the scarcity of anomalous data in visual anomaly detection by proposing AnomalyAny, a training-free framework that generates unseen anomalies using a pretrained Stable Diffusion model conditioned at test time on a normal sample. It couples test-time normal sample conditioning with attention-guided anomaly optimization and prompt-guided refinement to produce diverse, authentic anomalies for arbitrary objects. The approach is validated on MVTec AD and VisA, showing improved generation quality and meaningful gains in downstream anomaly detection under 1-shot and few-shot regimes. By enabling universal anomaly generation without data-specific training, AnomalyAny holds practical potential for enhancing AD systems and advancing foundation-model-based anomaly detection.

Abstract

Visual anomaly detection (AD) presents significant challenges due to the scarcity of anomalous data samples. While numerous works have been proposed to synthesize anomalous samples, these synthetic anomalies often lack authenticity or require extensive training data, limiting their applicability in real-world scenarios. In this work, we propose Anomaly Anything (AnomalyAny), a novel framework that leverages the image generation capabilities of Stable Diffusion (SD) to generate diverse and realistic unseen anomalies. By conditioning on a single normal sample at test time, AnomalyAny can generate unseen anomalies for arbitrary object types from text descriptions. Within AnomalyAny, we propose attention-guided anomaly optimization to direct SD's attention toward generating hard anomaly concepts. Additionally, we introduce prompt-guided anomaly refinement, incorporating detailed descriptions to further improve the generation quality. Extensive experiments on the MVTec AD and VisA datasets demonstrate AnomalyAny's ability to generate high-quality unseen anomalies and its effectiveness in enhancing downstream AD performance.

Paper Structure

This paper contains 22 sections, 13 equations, 17 figures, and 16 tables.

Figures (17)

  • Figure 1: Comparison between different visual anomaly generation methods. In comparison to existing methods, AnomalyAny can generate diverse and realistic unseen anomalies without training.
  • Figure 2: Illustration of AnomalyAny with details of the attention-guided & prompt-guided optimization process at time step $t$.
  • Figure 3: Examples of generated anomaly samples and attention map of damage. We present (a) Normal guidance image, and the results generated by (b) Stable Diffusion, (c) Ours w/o normal sample conditioning, (d) Ours w/o attention-guided optimization, (e) Ours w/o prompt-guided optimization, and (f) Our proposed AnomalyAny.
  • Figure 4: Visualization of intermediate generation results and the attention maps of anomaly tokens at different denoising steps.
  • Figure 5: Examples of anomalies generated after attention-based optimization on anomaly tokens (a) w/o and (b) w/ localization-aware scheduler.
  • ...and 12 more figures