AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

Teng Hu, Jiangning Zhang, Ran Yi, Yuzhen Du, Xu Chen, Liang Liu, Yabiao Wang, Chengjie Wang

TL;DR

A novel diffusion-based few-shot anomaly generation model that uses the strong prior of a latent diffusion model, learned from large-scale data, to improve generation authenticity when only a few anomaly samples are available for training. It also introduces a novel Adaptive Attention Re-weighting Mechanism.

Abstract

Anomaly inspection plays an important role in industrial manufacturing. Existing anomaly inspection methods are limited in their performance by insufficient anomaly data. Although anomaly generation methods have been proposed to augment the anomaly data, they suffer from either poor generation authenticity or inaccurate alignment between the generated anomalies and masks. To address these problems, we propose AnomalyDiffusion, a novel diffusion-based few-shot anomaly generation model, which utilizes the strong prior information of a latent diffusion model learned from large-scale datasets to enhance generation authenticity under few-shot training data. First, we propose Spatial Anomaly Embedding, which consists of a learnable anomaly embedding and a spatial embedding encoded from an anomaly mask, disentangling the anomaly information into anomaly appearance and location information. Moreover, to improve the alignment between the generated anomalies and the anomaly masks, we introduce a novel Adaptive Attention Re-weighting Mechanism. Based on the disparities between the generated anomaly image and the normal sample, it dynamically guides the model to focus more on the areas with less noticeable generated anomalies, enabling generation of accurately matched anomalous image-mask pairs. Extensive experiments demonstrate that our model significantly outperforms state-of-the-art methods in generation authenticity and diversity, and effectively improves the performance of downstream anomaly inspection tasks. The code and data are available at https://github.com/sjtuplayer/anomalydiffusion.
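
The disentanglement described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `spatial_anomaly_embedding`, the linear projection `proj` standing in for the mask encoder, and all token counts and dimensions are assumptions chosen for illustration. The only idea taken from the abstract is that the conditioning combines a learnable anomaly embedding (appearance) with a spatial embedding encoded from the mask (location).

```python
import numpy as np

def spatial_anomaly_embedding(e_a, mask, proj, n_tokens):
    """Build a conditioning sequence from appearance and location tokens.

    e_a      : (k, d) learnable anomaly tokens (appearance).
    mask     : (H, W) binary anomaly mask (location).
    proj     : (H*W, n_tokens*d) hypothetical linear mask encoder.
    n_tokens : number of spatial tokens produced from the mask.
    """
    d = e_a.shape[1]
    flat = mask.astype(np.float32).reshape(-1)       # (H*W,)
    e_s = (flat @ proj).reshape(n_tokens, d)         # spatial tokens from the mask
    # Joining the two parts here is an assumption; the abstract only says
    # the embedding "consists of" both.
    return np.concatenate([e_a, e_s], axis=0)        # (k + n_tokens, d)
```

In this sketch, changing the mask alters only the spatial tokens, while the appearance tokens stay fixed, which is the sense in which the two kinds of information are kept separate.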

Paper Structure

This paper contains 28 sections, 10 equations, 9 figures, and 12 tables.

Figures (9)

  • Figure 1: Top: Our model generates extensive anomaly data, which supports the downstream Anomaly Detection (AD), Localization (AL), and Classification (AC) tasks, while previous methods mainly rely on unsupervised learning or few-shot supervised learning due to the limited anomaly data. Bottom: Generated anomaly results on hazelnut-crack and capsule-squeeze from our model and existing anomaly generation methods, where our results are the most authentic.
  • Figure 2: Overall framework of our AnomalyDiffusion: 1) The Spatial Anomaly Embedding $e$, consisting of an anomaly embedding $e_a$ (a learned textual embedding that represents the anomaly appearance type) and a spatial embedding $e_s$ (encoded from an input anomaly mask $m$ to indicate anomaly locations), serves as the text condition that guides the anomaly generation process; 2) The Adaptive Attention Re-weighting Mechanism computes the weight map $w_m$ based on the difference between the denoised image $\hat{x}_0$ and the input normal sample $y$, and adaptively re-weights the cross-attention map $m_c$ by the weight map $w_m$ to help the model focus more on the less noticeable anomaly areas during the denoising process.
  • Figure 3: Comparison between the models w/ (Ours) and w/o Adaptive Attention Re-weighting (AAR). The model w/o AAR cannot generate anomalies that fill the entire mask.
  • Figure 4: Comparison of generation results on MVTec. Our model generates high-quality anomaly images that are accurately aligned with the anomaly masks.
  • Figure 5: Quantitative anomaly localization comparison using a U-Net trained on the data generated by DRAEM, DFMGAN, and our model. Our model achieves the best anomaly localization results.
  • ...and 4 more figures
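
The re-weighting step described in the Figure 2 caption can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' formula: the function names, the softmax over negative squared difference, the `temperature` parameter, the scaling by mask area, and the assumption that the attention map has the same resolution as the image are all choices made here. The caption only states that $w_m$ is computed from the difference between $\hat{x}_0$ and $y$ and is used to re-weight $m_c$ toward less noticeable anomaly areas.

```python
import numpy as np

def adaptive_weight_map(x0_hat, y, mask, temperature=1.0):
    """Weight map w_m: larger where the generated anomaly is less noticeable.

    x0_hat, y : (H, W, C) denoised estimate and normal sample.
    mask      : (H, W) binary anomaly mask m.
    """
    diff = ((x0_hat - y) ** 2).sum(axis=-1)          # per-pixel squared difference
    inside = mask > 0
    w_m = np.zeros_like(diff, dtype=np.float64)
    if not inside.any():
        return w_m                                   # empty mask: nothing to weight
    # Assumed scoring: small difference -> large logit -> large weight.
    logits = -diff[inside] / temperature
    logits = logits - logits.max()                   # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    # Scale by mask area so the mean weight inside the mask is 1.
    w_m[inside] = probs * inside.sum()
    return w_m

def reweight_attention(m_c, w_m):
    """Element-wise re-weighting of a cross-attention map m_c by w_m."""
    return m_c * w_m
```

With this choice, masked pixels where $\hat{x}_0$ still looks like the normal sample $y$ receive weights above 1, and pixels that already differ strongly receive weights below 1, so attention shifts toward the parts of the mask that are not yet filled.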