Improving Anomaly Detection with Foundation-Model Synthesis and Wavelet-Domain Attention

Wensheng Wu, Zheming Lu, Ziqian Lu, Zewei He, Xuecheng Sun, Zhao Wang, Jungong Han, Yunlong Yu

TL;DR

The paper proposes a foundation model-based anomaly synthesis pipeline (FMAS) that generates highly realistic anomalous samples without fine-tuning or class-specific training, and introduces a Wavelet Domain Attention Module (WDAM) that exploits adaptive sub-band processing to enhance anomaly feature extraction.

Abstract

Industrial anomaly detection faces significant challenges due to the scarcity of anomalous samples and the complexity of real-world anomalies. In this paper, we propose a foundation model-based anomaly synthesis pipeline (FMAS) that generates highly realistic anomalous samples without fine-tuning or class-specific training. Motivated by the distinct frequency-domain characteristics of anomalies, we introduce a Wavelet Domain Attention Module (WDAM), which exploits adaptive sub-band processing to enhance anomaly feature extraction. The combination of FMAS and WDAM significantly improves anomaly detection sensitivity while maintaining computational efficiency. Comprehensive experiments on the MVTec AD and VisA datasets demonstrate that WDAM, as a plug-and-play module, achieves substantial performance gains over existing baselines.

Paper Structure

This paper contains 16 sections, 9 equations, 7 figures, and 10 tables.

Figures (7)

  • Figure 1: Qualitative results showing anomaly samples generated by the proposed FMAS on the MVTec AD dataset, with red contours highlighting the anomalous regions.
  • Figure 2: Visualization of anomaly saliency in the four wavelet sub-bands. The leftmost column presents the original images, the middle columns show the LL, LH, HL, and HH sub-bands, and the rightmost column illustrates the corresponding masks. Red dashed boxes indicate defect locations, and red solid boxes highlight magnified regions. All samples are drawn from the MVTec AD dataset.
  • Figure 3: Illustration of the proposed foundation model-based anomaly synthesis pipeline (FMAS).
  • Figure 4: The proposed Wavelet Domain Attention Module (WDAM) processes input feature maps through a frequency-aware attention mechanism. First, WDAM decomposes the spatial features into multiple frequency sub-bands using Discrete Wavelet Transform (DWT). An adaptive attention mechanism then selectively enhances or suppresses features in each sub-band based on their relevance to anomaly detection. Finally, the refined frequency components are reconstructed back into the spatial domain via Inverse Discrete Wavelet Transform (IDWT), effectively preserving critical defect-related patterns while suppressing noise.
  • Figure 5: (a) The architecture of CutPaste with the proposed WDAM, wherein Bottleneck* denotes a modified bottleneck block incorporating WDAM, as elaborated in (b). The elements enclosed within the red dashed box indicate the additional components introduced in comparison to the original bottleneck configuration.
  • ...and 2 more figures
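
The Figure 4 caption describes WDAM as a decompose, reweight, reconstruct sequence. A minimal sketch of that flow in plain Python is given here, assuming a one-level Haar wavelet, a single-channel 2D feature map, and one scalar sigmoid gate per sub-band. The function names, the Haar choice, and the fixed `(weight, bias)` pairs are illustrative assumptions and not the authors' implementation, in which the attention is presumably learned end to end.

```python
import math


def haar_dwt2(x):
    """One-level 2D Haar DWT of a 2D list with even height and width.

    Returns (LL, LH, HL, HH), each half the size of x. LH/HL naming varies
    between libraries; here LH is the top-minus-bottom difference and HL the
    left-minus-right difference within each 2x2 block.
    """
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(x), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for j in range(0, len(x[0]), 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            row_ll.append((a + b + c + d) / 2)
            row_lh.append((a + b - c - d) / 2)
            row_hl.append((a - b + c - d) / 2)
            row_hh.append((a - b - c + d) / 2)
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the full-resolution 2D list."""
    out = []
    for i in range(len(ll)):
        top, bottom = [], []
        for j in range(len(ll[0])):
            s, v, h, d = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            top += [(s + v + h + d) / 2, (s + v - h - d) / 2]
            bottom += [(s - v + h - d) / 2, (s - v - h + d) / 2]
        out += [top, bottom]
    return out


def subband_gate(band, weight, bias):
    """Hypothetical gate: sigmoid of an affine function of mean |coefficient|."""
    count = len(band) * len(band[0])
    energy = sum(abs(v) for row in band for v in row) / count
    return 1.0 / (1.0 + math.exp(-(weight * energy + bias)))


def wavelet_attention(x, params):
    """DWT -> per-sub-band gating -> IDWT.

    params holds four assumed (weight, bias) pairs, ordered LL, LH, HL, HH.
    """
    gated = []
    for band, (weight, bias) in zip(haar_dwt2(x), params):
        g = subband_gate(band, weight, bias)
        gated.append([[g * v for v in row] for row in band])
    return haar_idwt2(*gated)


# Toy feature map with one bright "defect" pixel; values are made up.
feature_map = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 5.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]
params = [(0.0, 2.0), (1.0, 0.0), (1.0, 0.0), (1.0, 0.0)]
refined = wavelet_attention(feature_map, params)
```

Because the transform is linear and invertible, a gate of 1 on every sub-band returns the input unchanged, so any change in `refined` comes only from how the gates reweight the sub-bands. A real module would operate on multi-channel tensors and could use a wavelet library instead of the hand-written Haar transform.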