Background-Aware Defect Generation for Robust Industrial Anomaly Detection

Youngjae Cho, Gwangyeol Kim, Sirojbek Safarov, Seongdeok Bang, Jaewoo Park

TL;DR

This work tackles data scarcity in industrial anomaly detection by introducing a background-aware defect generation framework that disentangles defect denoising from the background. It combines a disentanglement loss with DDIM Inversion and masked cross-attention to synthesize contextually accurate defects, including logical anomalies, that respect the target background. Theoretical results guarantee background fidelity during defect generation, and extensive experiments on MVTec AD and MVTec Loco show higher defect generation quality and better anomaly detection performance than prior methods. The approach generalizes better, produces fewer unrealistic defects, and offers practical data augmentation benefits for robust inspection systems.

Abstract

Detecting anomalies in industrial settings is challenging due to the scarcity of labeled anomalous data. Generative models can mitigate this issue by synthesizing realistic defect samples, but existing approaches often fail to model the crucial interplay between defects and their background. This oversight leads to unrealistic anomalies, especially in scenarios where contextual consistency is essential, such as logical anomalies. To address this, we propose a novel background-aware defect generation framework in which the background influences defect denoising while remaining unaffected itself, ensuring realistic synthesis while preserving structural integrity. Our method leverages a disentanglement loss to separate the background's denoising process from that of the defect, enabling controlled defect synthesis through DDIM Inversion. We theoretically demonstrate that our approach maintains background fidelity while generating contextually accurate defects. Extensive experiments on the MVTec AD and MVTec Loco benchmarks validate our method's superiority over existing techniques in both defect generation quality and anomaly detection performance.

Paper Structure

This paper contains 28 sections, 6 theorems, 24 equations, 10 figures, 7 tables, 1 algorithm.

Key Result

Lemma 3.1

Suppose that $\mathcal{L}(\theta^*)=0$. Then the background of $z_t$ is reconstructed as follows: $(1-m)\odot z_{0}=(1-m)\odot \frac{1}{\sqrt{\alpha_t}}\left[z_t^m -\sqrt{1-\alpha_t}\, \epsilon_{\theta^*}(z_t^m,t,C_2)\right]$
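
The identity in Lemma 3.1 can be checked numerically with a small NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: $\alpha_t$ is taken to be the cumulative noise-schedule coefficient (so that $z_t=\sqrt{\alpha_t}\,z_0+\sqrt{1-\alpha_t}\,\epsilon$), $z_t^m$ is treated simply as the noisy latent passed to the network, the conditioning $C_2$ is omitted, and the trained network $\epsilon_{\theta^*}$ is replaced by a hypothetical stand-in predictor that is exact on the background and arbitrary inside the defect mask $m$. The function names are illustrative.

```python
import numpy as np


def predict_z0(z_t, eps_pred, alpha_t):
    """Clean-latent estimate from a noisy latent and a noise prediction.

    Implements (1 / sqrt(alpha_t)) * [z_t - sqrt(1 - alpha_t) * eps_pred],
    the bracketed expression in Lemma 3.1 (alpha_t assumed cumulative).
    """
    return (z_t - np.sqrt(1.0 - alpha_t) * eps_pred) / np.sqrt(alpha_t)


def background_reconstruction_error(seed=0, alpha_t=0.5):
    """Return (max background error, max defect-region error) of the estimate."""
    rng = np.random.default_rng(seed)
    z0 = rng.standard_normal((4, 8, 8))    # toy clean latent
    eps = rng.standard_normal(z0.shape)    # true forward-process noise
    m = np.zeros((1, 8, 8))
    m[:, 2:5, 2:5] = 1.0                   # toy defect mask (1 = defect)

    # Forward diffusion to step t.
    z_t = np.sqrt(alpha_t) * z0 + np.sqrt(1.0 - alpha_t) * eps

    # Hypothetical stand-in for the trained network: exact noise on the
    # background (the zero-loss assumption), arbitrary noise on the defect.
    eps_pred = (1.0 - m) * eps + m * rng.standard_normal(z0.shape)

    z0_hat = predict_z0(z_t, eps_pred, alpha_t)
    bg_err = np.abs((1.0 - m) * (z0_hat - z0)).max()
    defect_err = np.abs(m * (z0_hat - z0)).max()
    return bg_err, defect_err


if __name__ == "__main__":
    bg_err, defect_err = background_reconstruction_error()
    print(f"background error: {bg_err:.2e}, defect-region error: {defect_err:.2e}")
```

Under these assumptions the background error vanishes up to floating-point precision, while the masked region is free to differ from $z_0$. This is the sense in which the lemma lets a defect be synthesized without disturbing the background.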

Figures (10)

  • Figure 1: Left: comparison between our method and baselines on MVTec AD. Right: comparison between our method and baselines on MVTec Loco.
  • Figure 2: Framework of our defect generation. Left: overview of the denoising process for the latent in the U-Net. Right: details of the cross-attention process in the U-Net, where we use a masking strategy to disentangle each text embedding.
  • Figure 3: Visualization of synthetic instances given an identical target mask. The left side is from AnomalyDiffusion and the right side is from our method. The mask on the right is the refined mask from Eq. (10).
  • Figure 4: Comparison of training loss landscapes between our method and baselines on MVTec AD. The first row shows the loss landscapes for normal training samples. The second row shows the loss landscapes for anomalous training samples, excluding synthetic anomalies.
  • Figure 5: Additional defect generation results of DFMGAN on MVTec AD.
  • ...and 5 more figures

Theorems & Definitions (9)

  • Lemma 3.1
  • Theorem 3.2
  • Proposition 3.3
  • Lemma 1.1
  • proof
  • Theorem 1.2
  • proof
  • Proposition 1.3
  • proof