Feature Attenuation of Defective Representation Can Resolve Incomplete Masking on Anomaly Detection

YeongHyeon Park, Sungho Kang, Myung Jin Kim, Hyeong Seok Kim, Juneho Yi

TL;DR

This work addresses incomplete masking in reconstruction-by-inpainting-based unsupervised anomaly detection (UAD) by introducing FADeR, a lightweight two-layer MLP that attenuates defective representations through soft masking on skip connections. Trained with an active-learning strategy that derives patch-wise attenuation signals from reconstruction errors, FADeR does not require labeled defect data. Integrated with a pre-trained UAD model using a single deterministic mask, FADeR improves image- and pixel-level AUROC on MVTec AD and VisA and demonstrates plug-and-play compatibility with other masking approaches. The results show that a compact, edge-friendly module can significantly enhance UAD performance without large-scale architectural changes, enabling practical deployment in industrial settings.

Abstract

In unsupervised anomaly detection (UAD) research, state-of-the-art models have reached a saturation point after extensive studies on public benchmark datasets, yet they either adopt large-scale, tailor-made neural networks (NNs) for detection performance or pursue unified models for various tasks. For edge computing, a computationally efficient and scalable solution that avoids large-scale, complex NNs is necessary. Motivated by this, we aim to optimize UAD performance with minimal changes to NN settings. We therefore revisit the reconstruction-by-inpainting approach and improve it by analyzing its strengths and weaknesses. The strength of the SOTA methods is a single deterministic masking approach, which addresses the two challenges of random multiple masking: inference latency and output inconsistency. Nevertheless, a remaining weakness is that the mask can fail to completely cover anomalous regions. To mitigate this issue, we propose Feature Attenuation of Defective Representation (FADeR), which employs only two MLP layers that attenuate the feature information of anomaly reconstruction during decoding. By leveraging FADeR, features of unseen anomaly patterns are reconstructed into seen normal patterns, reducing false alarms. Experimental results demonstrate that FADeR achieves enhanced performance compared to similar-scale NNs. Furthermore, our approach also improves performance when integrated with other single deterministic masking methods in a plug-and-play manner.
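
The attenuation idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the feature layout (one row per patch), the hidden size, the ReLU activation, and the mapping from predicted error to mask value (`soft_mask`) are all assumptions made for illustration, and the helper names are hypothetical. The sketch only shows a two-layer MLP scoring each patch and the resulting soft mask being applied to skip-connection features as an element-wise product.

```python
import numpy as np

def init_mlp(in_dim, hidden_dim, rng):
    """Weights for a hypothetical two-layer MLP (sizes are assumptions)."""
    return {
        "w1": rng.normal(0.0, 0.02, (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, 0.02, (hidden_dim, 1)),
        "b2": np.zeros(1),
    }

def predict_patch_error(params, feats):
    """Map each patch feature row of (N, C) to a scalar error score, shape (N,)."""
    hidden = np.maximum(feats @ params["w1"] + params["b1"], 0.0)  # ReLU
    return (hidden @ params["w2"] + params["b2"]).squeeze(-1)

def soft_mask(scores):
    """Assumed mapping: higher predicted error -> mask value closer to 0."""
    return 1.0 - 1.0 / (1.0 + np.exp(-scores))

def attenuate(feats, mask):
    """Element-wise product of the soft mask with skip-connection features."""
    return feats * mask[:, None]

rng = np.random.default_rng(0)
skip_feats = rng.normal(size=(16, 32))  # 16 patches, 32 channels (toy sizes)
params = init_mlp(32, 64, rng)
mask = soft_mask(predict_patch_error(params, skip_feats))
attenuated = attenuate(skip_feats, mask)
```

Because every mask value lies strictly between 0 and 1, the product can only shrink feature magnitudes, which matches the description of soft masking as attenuation rather than hard removal.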

Paper Structure (17 sections, 4 equations, 6 figures, 5 tables)

This paper contains 17 sections, 4 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Binary masks of pre-trained attention and soft masks of FADeR. Masks are applied to images or feature maps as an element-wise product. The color bar represents the mask value; the closer the value is to 0 (blue), the stronger the attenuation. The soft mask attenuates the feature representations of defective parts that the incomplete binary attention mask misses. Even when the binary mask is larger than the defective region, as in the capsule case, FADeR further helps cut out the core part of the defective representation on skip connections.
  • Figure 2: Visual defect masking methods in reconstruction-by-inpainting for UAD. (a) shows a method for masking suspected defects. The choice of masking method is open; we adopt a single masking method because multiple masking methods suffer from inference latency. (b) shows a method of providing visual obfuscation-based hints in the masked regions. We leverage the strategy in (b) because it further improves the UAD performance of (a) by deriving an accurate normal reconstruction [EAR_Park_arXiv23].
  • Figure 3: Overview of our method, FADeR. We propose a simple two-layer MLP component to overcome the incomplete-mask issue of single deterministic masking in the visual defect obfuscation process. The visual obfuscation method is shown in Figure \ref{fig:obfuscation}. FADeR attenuates defective feature representations in skip connections. Its training details are shown in Figure \ref{fig:prep_loss}.
  • Figure 4: Construction of the ground truth (GT) loss for training FADeR with only normal samples. After the GT loss is generated from $I$ and $\hat{I}$, FADeR learns to predict it from $I'$ via an MLP, using the loss function in Eq. \ref{eq:loss_rank}.
  • Figure 5: Inference process of FADeR. FADeR attenuates visual defective features by predicting patch-wise error.
  • ...and 1 more figures
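
The captions for Figures 4 and 5 describe a patch-wise error computed from an image $I$ and its reconstruction $\hat{I}$. A minimal sketch of one way to compute such a target is given below. The error metric (mean squared error), the non-overlapping patch grid, and the function name `patchwise_error` are assumptions for illustration; the paper's actual GT loss and the loss function it references as `eq:loss_rank` are not reproduced here.

```python
import numpy as np

def patchwise_error(image, recon, patch):
    """Mean squared error per non-overlapping patch; inputs are (H, W, C)."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "toy sketch: sizes must divide"
    sq = (image - recon) ** 2
    # Split rows and columns into (grid index, offset within patch).
    grid = sq.reshape(h // patch, patch, w // patch, patch, c)
    return grid.mean(axis=(1, 3, 4))  # shape (H / patch, W / patch)

image = np.zeros((8, 8, 3))
recon = np.zeros((8, 8, 3))
recon[0:4, 4:8, :] = 1.0  # reconstruction differs only in the top-right patch
gt_error = patchwise_error(image, recon, patch=4)
```

In this toy case only the top-right entry of `gt_error` is non-zero, so a predictor trained on such targets would be pushed to score that patch highest.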