Context Enhancement with Reconstruction as Sequence for Unified Unsupervised Anomaly Detection

Hui-Yue Yang, Hui Chen, Lihao Liu, Zijia Lin, Kai Chen, Liejun Wang, Jungong Han, Guiguang Ding

TL;DR

This work introduces Reconstruction as Sequence (RAS), a transformer-based method that treats feature reconstruction as a sequence modeling problem in order to enhance contextual correspondence, and that is built around a specialized RASFormer block.

Abstract

Unsupervised anomaly detection (AD) aims to train robust detection models using only normal samples, such that they generalize well to unseen anomalies. Recent research focuses on a unified unsupervised AD setting in which only one model is trained for all classes, i.e., the n-class-one-model paradigm. Feature-reconstruction-based methods achieve state-of-the-art performance in this scenario. However, existing methods often lack sufficient contextual awareness, which compromises the quality of the reconstruction. To address this issue, we introduce a novel Reconstruction as Sequence (RAS) method, which enhances the contextual correspondence during feature reconstruction from a sequence modeling perspective. In particular, building on the transformer technique, we integrate a specialized RASFormer block into RAS. This block captures spatial relationships among different image regions and enhances sequential dependencies throughout the reconstruction process. By incorporating the RASFormer block, our RAS method achieves stronger contextual awareness, which in turn improves detection performance. Experimental results show that RAS significantly outperforms competing methods, demonstrating its effectiveness. Our code is available at https://github.com/Nothingtolose9979/RAS.
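
To make the feature-reconstruction idea in the abstract concrete, the following is a minimal sketch of how such methods are commonly scored: a model reconstructs the feature map of an input, and locations that are reconstructed poorly receive high anomaly scores. It is not the RAS implementation; the function names `anomaly_map` and `image_score`, the L2 distance, and the use of the maximum as an image-level score are all assumptions made for illustration, and the reconstruction here is simulated rather than produced by a RASFormer block.

```python
import numpy as np


def anomaly_map(features: np.ndarray, reconstructed: np.ndarray) -> np.ndarray:
    """Per-location L2 distance between original and reconstructed features.

    Both inputs are assumed to have shape (C, H, W); the result has shape (H, W).
    """
    if features.shape != reconstructed.shape:
        raise ValueError("features and reconstructed must have the same shape")
    # Reduce over the channel axis so each spatial location gets one score.
    return np.linalg.norm(features - reconstructed, axis=0)


def image_score(amap: np.ndarray) -> float:
    """Image-level score, here assumed to be the maximum of the anomaly map."""
    return float(amap.max())


# Toy example: a perfect reconstruction everywhere except one location.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 4, 4))   # hypothetical (C=8, H=4, W=4) feature map
recon = feats.copy()
recon[:, 1, 2] += 1.0                # simulate a poorly reconstructed region

amap = anomaly_map(feats, recon)
peak = np.unravel_index(amap.argmax(), amap.shape)
print("most anomalous location:", peak, "score:", image_score(amap))
```

In this toy setup only the perturbed location has a nonzero score, so it dominates both the anomaly map and the image-level score. In a real pipeline the features would come from a pretrained backbone and the reconstruction from the trained model.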

Paper Structure (12 sections, 13 equations, 6 figures, 6 tables)


Figures (6)

  • Figure 1: Comparison of different paradigms in anomaly detection. Left: n-class-n-model paradigm, where separate models are trained for each class. Right: n-class-one-model paradigm, utilizing a unified model to detect anomalies across all classes.
  • Figure 2: Top: inspection of the reconstruction failure of UniAD. Bottom: illustration of the superior reconstruction quality of our proposed RAS method. $\bm{I}$ is an anomalous metal nut.
  • Figure 3: Overview of the proposed RAS framework for unified unsupervised anomaly detection. We enhance the contextual awareness capability during feature reconstruction via a specially designed RASFormer block. The uppermost panel depicts the pipeline, while the two boxes below illustrate the detailed architecture of the encoder-decoder that performs feature reconstruction.
  • Figure 4: Qualitative results for anomaly maps on MVTec-AD. We render each anomaly map as a heat map for better visualization. Regions with higher anomaly scores are depicted in vibrant red. Best viewed in color. "GT" means the ground truth.
  • Figure 5: Visual comparison of image reconstruction. We use bounding boxes to distinguish the worse (red) and better (green) regions.
  • ...and 1 more figure