A Novel Approach to Industrial Defect Generation through Blended Latent Diffusion Model with Online Adaptation

Hanxi Li, Zhengxun Zhang, Hao Chen, Lin Wu, Bo Li, Deyin Liu, Mingwen Wang

TL;DR

This work tackles the shortage of defective samples in industrial anomaly detection by introducing AdaBLDM, a defect-augmented diffusion framework that tailors a Blended Latent Diffusion Model with defect-specific controls. Key innovations include a defect trimap as spatial guidance, cross-modal linguistic prompts, and a three-stage denoising inference with latent and pixel-level content editing, plus online decoder adaptation to refine realism. The approach yields state-of-the-art AD performance on MVTec AD with augmented data (gains of approximately 1.5 percentage points in AP, 1.9 in IAP, and 3.1 in IAP90) and strong results on BTAD and KSDD2, outperforming GAN- and past diffusion-based methods. Empirically, AdaBLDM provides more reliable defect generation with better region alignment, enabling practical improvements in industrial AD training and deployment.

Abstract

Industrial Anomaly Detection (AD) benefits from an ample supply of defective samples, yet such samples are scarce in industrial contexts. This paper introduces a novel algorithm designed to augment defective samples, thereby enhancing AD performance. The proposed method tailors the blended latent diffusion model for defect sample generation, employing a diffusion model to generate defective samples in the latent space. A feature editing process, controlled by a "trimap" mask and text prompts, refines the generated samples. The image generation inference process is structured into three stages: a free diffusion stage, an editing diffusion stage, and an online decoder adaptation stage. This inference strategy yields high-quality synthetic defective samples with diverse pattern variations, leading to significantly improved AD accuracy when training on the augmented set. Specifically, on the widely recognized MVTec AD dataset, the proposed method elevates the state-of-the-art (SOTA) performance of AD with augmented data by 1.5%, 1.9%, and 3.1% for the AD metrics AP, IAP, and IAP90, respectively. The implementation code of this work can be found at the GitHub repository https://github.com/GrandpaXun242/AdaBLDM.git
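
The mask-controlled editing described in the abstract can be illustrated with a minimal sketch of latent blending under a trimap. This is not the authors' implementation: the label values `BACKGROUND`, `OBJECT`, `DEFECT`, the helper names `make_trimap` and `blend_latents`, and the toy array shapes are all assumptions made for illustration, and the rule that defects are only kept where they overlap the object is likewise a guess at how a trimap might be used.

```python
import numpy as np

# Hypothetical trimap labels (an assumption; the source only says the
# trimap indicates the locations of the object and the defect).
BACKGROUND, OBJECT, DEFECT = 0, 1, 2


def make_trimap(object_mask, defect_mask):
    """Build a three-valued map from two boolean masks of shape (H, W)."""
    trimap = np.full(object_mask.shape, BACKGROUND, dtype=np.uint8)
    trimap[object_mask] = OBJECT
    # Assumption: a defect is only meaningful where it lies on the object.
    trimap[object_mask & defect_mask] = DEFECT
    return trimap


def blend_latents(z_generated, z_source, trimap):
    """Keep generated latents inside the defect region, source latents elsewhere.

    z_generated, z_source: arrays of shape (C, H, W).
    trimap: array of shape (H, W); it broadcasts over the channel axis.
    """
    edit = (trimap == DEFECT).astype(z_generated.dtype)
    return edit * z_generated + (1.0 - edit) * z_source


# Toy example: a 4x4 latent grid with 3 channels.
object_mask = np.zeros((4, 4), dtype=bool)
object_mask[1:4, 1:4] = True
defect_mask = np.zeros((4, 4), dtype=bool)
defect_mask[2, 2] = True

trimap = make_trimap(object_mask, defect_mask)
z_source = np.zeros((3, 4, 4))      # stands in for the defect-free latent
z_generated = np.ones((3, 4, 4))    # stands in for the denoised defect latent
blended = blend_latents(z_generated, z_source, trimap)
```

In this toy case only the single defect cell takes its value from `z_generated`; every other position is copied from `z_source`, which is the general idea behind restricting an edit to a masked region.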

Paper Structure (29 sections, 8 equations, 7 figures, 6 tables, 2 algorithms)

Figures (7)

  • Figure 1: Illustration of three defect generation styles. From top to bottom: conventional approaches, GAN-based algorithms, and the proposed method.
  • Figure 2: The network structure of the proposed BLDM-based method for generating defective regions on an image. Besides the noise input, the model is governed by a text prompt and a trimap that indicates the locations of the object and the defect.
  • Figure 3: The inference scheme of the proposed AdaBLDM algorithm. The whole procedure can be divided into four main stages: the pure denoising stage without editing, the latent editing stage, the image editing stage, and the decoder adaptation stage.
  • Figure 4: The genuine and synthetic defective samples on the MVTec AD dataset. From top to bottom: samples with genuine defects, samples generated using DCDGANc-StarGAN, samples generated using DCDGANc-StyleGAN, samples generated with DFM, and samples generated using our method.
  • Figure 5: The fine-grained comparison in the generation quality on the Hazelnut subcategory of MVTec AD. Different kinds of defects are mimicked by the involved generation algorithms and their performances can be compared on a more detailed level.
  • ...and 2 more figures