Attention Modules Improve Modern Image-Level Anomaly Detection: A DifferNet Case Study

André Luiz B. Vieira e Silva, Francisco Simões, Danny Kowerko, Tobias Schlosser, Felipe Battisti, Veronica Teichrieb

TL;DR

The paper addresses the challenge of unsupervised anomaly detection in industrial visual inspection, particularly under in-the-wild conditions where defective samples are scarce. It introduces AttentDifferNet, which augments DifferNet with modular attention blocks (SENet or CBAM) to improve feature embeddings via depth-aware attention. Across InsPLAD-fault, MVTec AD, and Semiconductor Wafer, AttentDifferNet consistently outperforms standard DifferNet, achieving higher AUROC scores and, in some cases, state-of-the-art results on in-the-wild data. The work demonstrates that lightweight attention modules can significantly enhance image-level anomaly detection with normalizing-flow-based embeddings, suggesting broad applicability in practical industrial inspection scenarios.
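
To make the idea of a modular channel-attention block concrete, here is a minimal pure-Python sketch of a squeeze-and-excitation (SE) style gate. It is an illustration under stated assumptions, not the authors' implementation: the function names, the nested-list tensor layout, and the weight matrices `w_reduce` and `w_expand` are hypothetical, biases are omitted, and nothing here reflects where AttentDifferNet places such blocks in its backbone.

```python
import math


def se_channel_gates(feature_map, w_reduce, w_expand):
    """Compute one gate in (0, 1) per channel.

    feature_map: list of C channels, each an H x W nested list of floats.
    w_reduce:    hypothetical (C/r) x C weight matrix (bottleneck layer).
    w_expand:    hypothetical C x (C/r) weight matrix (expansion layer).
    """
    # Squeeze: global average pooling collapses each channel to one number.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: linear bottleneck followed by ReLU ...
    hidden = [
        max(0.0, sum(w * s for w, s in zip(weights, squeezed)))
        for weights in w_reduce
    ]
    # ... then a linear expansion followed by a sigmoid.
    logits = [sum(w * h for w, h in zip(weights, hidden)) for weights in w_expand]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def se_block(feature_map, w_reduce, w_expand):
    """Rescale every channel of feature_map by its learned gate."""
    gates = se_channel_gates(feature_map, w_reduce, w_expand)
    return [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature_map)
    ]


# Toy input: two 2x2 channels and hand-picked (untrained) weights.
features = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
reduce_w = [[1.0, 1.0]]    # 1 x 2: reduces two channels to one hidden unit
expand_w = [[0.0], [1.0]]  # 2 x 1: expands back to two channels
reweighted = se_block(features, reduce_w, expand_w)
```

The output has the same shape as the input, so a block of this kind can be dropped between existing layers of a feature extractor without changing what downstream layers expect. CBAM follows a related idea but also applies attention over spatial positions.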

Abstract

Within (semi-)automated visual inspection, learning-based approaches for assessing visual defects, including deep neural networks, make it possible to process defect patterns that occupy only a few pixels in high-resolution imagery. The rarity of these defect patterns explains the general need for labeled data corpora. To alleviate this issue and to advance the current state of the art in unsupervised visual inspection, this contribution proposes AttentDifferNet, a DifferNet-based solution whose backbone is enhanced with attention modules (SENet and CBAM), to improve detection and classification on three visual inspection and anomaly detection datasets: MVTec AD, InsPLAD-fault, and Semiconductor Wafer. Compared with the current state of the art, AttentDifferNet achieves improved results in both our quantitative and qualitative evaluations, with overall AUC improvements of 94.34% vs. 92.46%, 96.67% vs. 94.69%, and 90.20% vs. 88.74%. Because our AttentDifferNet variants show great promise among currently investigated approaches, a baseline is formulated that emphasizes the importance of attention for anomaly detection.
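
For readers unfamiliar with the metric, the AUC values above are areas under the ROC curve computed from per-image anomaly scores. The sketch below shows one standard way to compute it, as the fraction of (anomalous, normal) pairs that are ranked correctly. The function name, the scores, and the labels are made up for illustration, and the sketch assumes that a higher score means "more anomalous"; it is not the paper's evaluation code.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: anomaly score per image (higher = more anomalous, by assumption).
    labels: 1 for a defective image, 0 for a flawless one.
    Equals the probability that a random defective image outscores a random
    flawless one, with ties counted as half a win.
    """
    anomalous = [s for s, y in zip(scores, labels) if y == 1]
    normal = [s for s, y in zip(scores, labels) if y == 0]
    if not anomalous or not normal:
        raise ValueError("AUROC needs at least one sample of each class")

    wins = 0.0
    for a in anomalous:
        for n in normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomalous) * len(normal))


# Hypothetical scores: one defective image (0.35) scores below a flawless one (0.4).
example_scores = [0.1, 0.4, 0.35, 0.8]
example_labels = [0, 0, 1, 1]
print(auroc(example_scores, example_labels))  # 3 of 4 pairs ranked correctly -> 0.75
```

An AUROC of 1.0 means every defective image scores above every flawless one, and 0.5 corresponds to random ranking. The pairwise loop is quadratic in the number of images, which is fine for a sketch but slower than rank-based implementations on large test sets.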

Paper Structure (6 sections, 3 figures, 4 tables)

Figures (3)

  • Figure 1: Proposed AttentDifferNet architecture.
  • Figure 2: InsPLAD samples. The first row shows flawless assets, while the second shows defective ones. From left to right: Glass Insulator, Lightning Rod Suspension, Polymer Insulator Upper Shackle, Vari-grip, and Yoke Suspension.
  • Figure 3: Exemplary Grad-CAM-based class activation mapping comparison for DifferNet vs. AttentDifferNet given two categories from InsPLAD-fault (Glass Insulator and Vari-grip) and two from MVTec AD (Capsule and Grid), respectively.