Correcting Deviations from Normality: A Reformulated Diffusion Model for Multi-Class Unsupervised Anomaly Detection

Farzad Beizaee, Gregory A. Lodygensky, Christian Desrosiers, Jose Dolz

TL;DR

The paper tackles unsupervised multi-class anomaly detection by addressing a limitation of standard diffusion models: they perturb the whole image uniformly and thereby degrade normal content. It introduces DeCo-Diff, a deviation-correcting diffusion framework that operates in a latent space via a masked forward process and a Direction of Deviation predictor, using DDIM-inspired inference to restore normal content in anomalous regions while preserving normal ones. Multi-scale discrepancies across pixel and latent spaces are fused with a geometric mean to produce precise anomaly localization. Empirical results on MVTec-AD, VisA, and additional datasets show state-of-the-art performance in both image-level detection and pixel-level localization, with substantial improvements in AUPRC and AUPRO, supported by qualitative visualizations of accurate anomaly correction.
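The geometric-mean fusion mentioned above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the nearest-neighbour upsampling, the use of absolute differences as discrepancy maps, and the toy values are all assumptions chosen for illustration. The point it shows is that a geometric mean stays high only where both the pixel-level and the latent-level discrepancy are high.

```python
import math

def abs_diff_map(a, b):
    """Per-element absolute difference between two equally sized 2-D grids."""
    return [[abs(x - y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling (assumed here) to match pixel resolution."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def fuse_geometric_mean(pixel_map, latent_map):
    """Element-wise geometric mean of two discrepancy maps of equal size."""
    return [[math.sqrt(p * l) for p, l in zip(rp, rl)]
            for rp, rl in zip(pixel_map, latent_map)]

# Toy 4x4 "image" with one anomalous pixel at (1, 1) and its reconstruction.
image = [[0.2, 0.2, 0.2, 0.2],
         [0.2, 0.9, 0.2, 0.2],
         [0.2, 0.2, 0.2, 0.2],
         [0.2, 0.2, 0.2, 0.2]]
recon = [[0.2, 0.2, 0.2, 0.2],
         [0.2, 0.2, 0.2, 0.2],
         [0.2, 0.2, 0.2, 0.2],
         [0.2, 0.2, 0.2, 0.3]]  # small reconstruction error at (3, 3)

# Toy 2x2 latents: only the top-left latent cell deviates.
latent = [[1.0, 0.0], [0.0, 0.0]]
latent_recon = [[0.0, 0.0], [0.0, 0.0]]

pixel_disc = abs_diff_map(image, recon)
latent_disc = upsample_nearest(abs_diff_map(latent, latent_recon), factor=2)
fused = fuse_geometric_mean(pixel_disc, latent_disc)
```

In this toy case the fused map keeps the response at (1, 1), where both maps agree, and suppresses the pixel-only reconstruction error at (3, 3), because the latent discrepancy there is zero.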

Abstract

Recent advances in diffusion models have spurred research into their application for reconstruction-based unsupervised anomaly detection. However, these methods may struggle with maintaining structural integrity and recovering the anomaly-free content of abnormal regions, especially in multi-class scenarios. Furthermore, diffusion models are inherently designed to generate images from pure noise and struggle to selectively alter anomalous regions of an image while preserving normal ones. This leads to potential degradation of normal regions during reconstruction, hampering the effectiveness of anomaly detection. This paper introduces a reformulation of the standard diffusion model geared toward selective region alteration, allowing the accurate identification of anomalies. By modeling anomalies as noise in the latent space, our proposed Deviation correction diffusion (DeCo-Diff) model preserves the normal regions and encourages transformations exclusively on anomalous areas. This selective approach enhances the reconstruction quality, facilitating effective unsupervised detection and localization of anomalous regions. Comprehensive evaluations demonstrate the superiority of our method in accurately identifying and localizing anomalies in complex images, with pixel-level AUPRC improvements of 11-14% over state-of-the-art models on well-known anomaly detection datasets. The code is available at https://github.com/farzad-bz/DeCo-Diff
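To make the idea of "modeling anomalies as noise in the latent space" concrete, here is a minimal sketch of a masked forward step. It assumes the standard closed-form DDPM noising applied only where a binary mask is 1, and it uses the difference between the noised and clean latent as a stand-in for the deviation target. The page does not give the exact formulation, so `masked_forward`, `deviation_target`, the flat list latent, and the value of `alpha_bar_t` are all hypothetical.

```python
import math
import random

def masked_forward(z0, mask, alpha_bar_t, rng):
    """Noise only the masked latent entries; unmasked entries stay equal to z0.

    Assumed form: z_t = m * (sqrt(a) * z0 + sqrt(1 - a) * eps) + (1 - m) * z0
    """
    z_t = []
    for z, m in zip(z0, mask):
        eps = rng.gauss(0.0, 1.0)
        noised = math.sqrt(alpha_bar_t) * z + math.sqrt(1.0 - alpha_bar_t) * eps
        z_t.append(m * noised + (1 - m) * z)
    return z_t

def deviation_target(z_t, z0):
    """Hypothetical training target: deviation of z_t from the clean latent."""
    return [a - b for a, b in zip(z_t, z0)]

rng = random.Random(0)        # fixed seed so the sketch is reproducible
z0 = [0.5, -1.0, 0.25, 2.0]   # toy clean latent (flattened)
mask = [1, 0, 1, 0]           # 1 = position is perturbed, 0 = left untouched

z_t = masked_forward(z0, mask, alpha_bar_t=0.6, rng=rng)
target = deviation_target(z_t, z0)
```

Under these assumptions the target is exactly zero at unmasked positions, which is one way a model could learn to leave normal regions unchanged and correct only the perturbed ones.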

Paper Structure

This paper contains 26 sections, 15 equations, 8 figures, 11 tables.

Figures (8)

  • Figure 1: Diffusion model reconstruction vs. DeCo-Diff. Fine details and patterns of a normal image are changed during the standard forward-backward diffusion process: the "wood" image becomes a "tile" sample when $T$ steps are applied. In contrast, DeCo-Diff does not alter the details of the image when $T$ correction steps are applied, maintaining the appearance of the original input image.
  • Figure 1: Additional qualitative results on the MVTec-AD dataset. From top to bottom: the original input image (with anomalies), the DeCo-Diff reconstruction, the ground truth mask, and the predicted anomaly mask across different objects of the MVTec-AD dataset.
  • Figure 2: Overview of the proposed method. During training (top), normal images are partially diffused using random masks and time-steps randomly sampled from $[1, T]$. The DeCo-Diff model is then trained to predict the direction of deviation from the input image. At inference (bottom), starting from time-step $T$ for the target images, DeCo-Diff progressively corrects the deviation from normality.
  • Figure 2: Additional qualitative results on the VisA dataset. From top to bottom: the original input image (with anomalies), the DeCo-Diff reconstruction, the ground truth mask, and the predicted anomaly mask across different objects of the VisA dataset.
  • Figure 3: Qualitative results. From top to bottom: the original input image (with anomalies), DeCo-Diff reconstruction, the ground truth mask, and the predicted anomaly mask. Examples are depicted for two datasets (MVTec-AD on the left side and VisA on the right side) and across multiple anomalies with diverse complexity.
  • ...and 3 more figures