Feature Attenuation of Defective Representation Can Resolve Incomplete Masking on Anomaly Detection

YeongHyeon Park; Sungho Kang; Myung Jin Kim; Hyeong Seok Kim; Juneho Yi

TL;DR

This work addresses incomplete masking in reconstruction-by-inpainting-based unsupervised anomaly detection (UAD) by introducing FADeR, a lightweight two-layer MLP that attenuates defective representations through soft masking on skip connections. Trained with an active-learning strategy that derives patch-wise attenuation signals from reconstruction errors, FADeR does not require labeled defect data. Integrated with a pre-trained UAD model using a single deterministic mask, FADeR improves image- and pixel-level AUROC on MVTec AD and VisA and demonstrates plug-and-play compatibility with other masking approaches. The results show that a compact, edge-friendly module can significantly enhance UAD performance without large-scale architectural changes, enabling practical deployment in industrial settings.

Abstract

In unsupervised anomaly detection (UAD) research, state-of-the-art (SOTA) models have reached a saturation point after extensive studies on public benchmark datasets, yet they either adopt large-scale, tailor-made neural networks (NNs) to gain detection performance or pursue unified models for various tasks. For edge computing, a computationally efficient and scalable solution that avoids large-scale, complex NNs is needed. Motivated by this, we aim to optimize UAD performance with minimal changes to NN settings. We therefore revisit the reconstruction-by-inpainting approach and analyze its strengths and weaknesses in order to improve it. The strength of the SOTA methods is a single deterministic masking approach, which addresses the two challenges of random multiple masking: inference latency and output inconsistency. A weakness remains, however: the mask can fail to completely cover anomalous regions. To mitigate this issue, we propose Feature Attenuation of Defective Representation (FADeR), which employs only two MLP layers to attenuate the feature information responsible for reconstructing anomalies during decoding. By leveraging FADeR, features of unseen anomaly patterns are reconstructed into seen normal patterns, reducing false alarms. Experimental results demonstrate that FADeR achieves enhanced performance compared to similar-scale NNs. Furthermore, our approach scales: it also improves performance when integrated with other single deterministic masking methods in a plug-and-play manner.
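The attenuation idea can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation. It assumes a two-layer MLP with a ReLU hidden layer that maps each patch feature vector to a scalar predicted error, min-max normalizes those scores across patches, and uses one minus the normalized score as a soft mask that is applied to the skip-connection features by element-wise product. The function names, weights, activation, and normalization are all hypothetical choices made for illustration.

```python
# Illustrative sketch only; layer sizes, activation, and normalization are assumptions.

def mlp_score(feature, w1, b1, w2, b2):
    """Two-layer MLP: feature vector -> scalar predicted error for one patch."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, feature)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def soft_mask(scores):
    """Map patch scores to [0, 1]; the highest predicted error gets mask value 0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:  # all patches look alike: apply no attenuation
        return [1.0 for _ in scores]
    return [1.0 - (s - lo) / (hi - lo) for s in scores]


def attenuate(skip_features, mask):
    """Element-wise product of each patch feature with its mask value."""
    return [[m * x for x in feat] for feat, m in zip(skip_features, mask)]


# Toy example: three patches with 2-dimensional skip features (made-up numbers).
skip = [[1.0, 2.0], [4.0, 4.0], [2.0, 1.0]]
w1, b1 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]   # hidden layer (2 units)
w2, b2 = [1.0, 1.0], 0.0                        # output layer (1 unit)

scores = [mlp_score(f, w1, b1, w2, b2) for f in skip]
mask = soft_mask(scores)
attenuated = attenuate(skip, mask)
```

In this toy setup the second patch receives the largest predicted error, so its features are fully suppressed while the other two pass through unchanged. A relative (normalized) mask is used here so that attenuation depends on how a patch compares with the others rather than on an absolute threshold.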

Paper Structure (17 sections, 4 equations, 6 figures, 5 tables)

Introduction
Related work
Reconstruction-by-inpainting methods
Feature representations on UAD
Methods
Overview
Method to resolve incomplete mask
Soft mask for relative attenuation
Experiments
Experimental setup
Visual analysis of soft mask
Performance of industrial inspection
Generalizing FADeR on other attentions
Ablation study
Experiments with other masking methods
...and 2 more sections

Figures (6)

Figure 1: Binary masks of pre-trained attention and soft masks of FADeR. Masks are applied to images or feature maps as an element-wise product. The color bar represents the value of the mask. The closer the mask value is to 0 (blue), the stronger the attenuation. The soft mask attenuates feature representations of the defective parts that the incomplete binary attention mask misses. Even when the binary mask is larger than the defective regions, as in the capsule case, FADeR further helps cut out the core part of the defective representation on skip connections.
Figure 2: Visual defect masking methods in reconstruction-by-inpainting for UAD. (a) shows a method for masking suspected defects. Any masking method can be plugged in at this step. We adopt a single masking method because multiple masking methods suffer from inference latency. (b) shows a method of providing visual obfuscation-based hints in the masked regions. We leverage the strategy shown in (b) because it further improves the UAD performance of (a) by deriving an accurate normal reconstruction [EAR_Park_arXiv23].
Figure 3: The overview of our method, FADeR. We propose a simple two-layer MLP component to overcome the incomplete mask issue in the single deterministic masking of the visual defect obfuscation process. The visual obfuscation method is shown in Figure 2. FADeR attenuates defective feature representations in skip connections. The training detail of FADeR is shown in Figure 4.
Figure 4: Construction of ground truth (GT) loss for training FADeR with only normal samples. After generating the GT loss using $I$ and $\hat{I}$, FADeR learns to predict the GT loss from $I'$ via an MLP, using the loss function in Eq. \ref{eq:loss_rank}.
Figure 5: Inference process of FADeR. FADeR attenuates defective visual features by predicting patch-wise error.
...and 1 more figures
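
The training signal described in the Figure 4 caption can be illustrated as follows. This sketch assumes the ground-truth loss is the mean squared error between $I$ and $\hat{I}$ averaged within non-overlapping square patches, and it uses a pairwise margin ranking loss as a stand-in for the paper's loss function, whose exact form is not given in this summary. The patch size, margin, and function names are hypothetical, and the predictions below are hand-written placeholders for what an MLP would output from $I'$.

```python
# Illustrative sketch only; patch size, margin, and the ranking-loss form are assumptions.

def patchwise_mse(image, recon, patch):
    """Mean squared error per non-overlapping patch x patch block (2-D lists)."""
    rows, cols = len(image), len(image[0])
    errors = []
    for r0 in range(0, rows, patch):
        for c0 in range(0, cols, patch):
            sq = [(image[r][c] - recon[r][c]) ** 2
                  for r in range(r0, min(r0 + patch, rows))
                  for c in range(c0, min(c0 + patch, cols))]
            errors.append(sum(sq) / len(sq))
    return errors


def pairwise_ranking_loss(pred, target, margin=1.0):
    """Hinge penalty when a pair of predictions is not ordered like its targets."""
    total, pairs = 0.0, 0
    for i in range(len(pred)):
        for j in range(i + 1, len(pred)):
            if target[i] == target[j]:
                continue  # ties carry no ordering information
            sign = 1.0 if target[i] > target[j] else -1.0
            total += max(0.0, margin - sign * (pred[i] - pred[j]))
            pairs += 1
    return total / pairs if pairs else 0.0


# Toy 2x4 image and reconstruction (made-up values), split into two 2x2 patches.
image = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0]]
recon = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0]]

gt_loss = patchwise_mse(image, recon, patch=2)
good = pairwise_ranking_loss([0.0, 2.0], gt_loss)  # same order as gt_loss
bad = pairwise_ranking_loss([2.0, 0.0], gt_loss)   # reversed order
```

A ranking-style objective only asks the predictor to order patches by error rather than to match error magnitudes, which is one plausible reason to prefer it when only normal samples are available for training.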
