A Hierarchically Feature Reconstructed Autoencoder for Unsupervised Anomaly Detection
Honghui Chen, Pingping Chen, Huan Mao, Mengxi Jiang
TL;DR
This work tackles unsupervised anomaly detection and localization without labeled anomalies or data augmentation by introducing a simple encoder–decoder architecture. A fixed, ImageNet-pretrained encoder extracts hierarchical features, and a decoder reconstructs these features across $K=3$ levels to generate multi-scale residual maps used for anomaly detection and localization. By training only the decoder to minimize multi-level feature reconstruction losses and fusing residuals into an anomaly map, the method achieves strong performance across MNIST, Fashion-MNIST, CIFAR-10, and MVTecAD, often surpassing state-of-the-art approaches. The approach is notable for its simplicity, efficiency (single forward pass during inference), and effectiveness in leveraging feature-space reconstruction rather than pixel-level recovery.
Abstract
Anomaly detection and localization without manual annotations or prior knowledge is a challenging task in the unsupervised learning setting. Existing works achieve excellent anomaly detection performance, but rely on complex networks or cumbersome pipelines. To address this issue, this paper explores a simple but effective architecture for anomaly detection. It consists of a pre-trained encoder that extracts hierarchical feature representations and a decoder that reconstructs these intermediate features from the encoder. In particular, it requires neither data augmentation nor anomalous images for training. Anomalies are detected where the decoder fails to reconstruct the features well, and the hierarchical feature reconstruction errors are then aggregated into an anomaly map for localization. Comparing encoder and decoder features across multiple levels leads to more accurate and robust localization than comparing a single feature level or comparing images pixel by pixel, as in conventional works. Experimental results show that the proposed method outperforms state-of-the-art methods on the MNIST, Fashion-MNIST, CIFAR-10, and MVTec Anomaly Detection datasets in both anomaly detection and localization.
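To make the aggregation step concrete, the following is a minimal sketch of how per-level feature reconstruction errors could be fused into one anomaly map. It is not the authors' implementation. The text above only states that the errors are aggregated, so the squared L2 distance across channels, the nearest-neighbor upsampling, the summation across levels, and the use of the map maximum as an image-level score are all assumptions made for illustration. The function names are hypothetical, and the feature maps are toy nested lists standing in for real encoder and decoder outputs.

```python
import copy


def residual_map(enc, dec):
    """Squared L2 distance across channels at each spatial location.

    enc, dec: feature maps stored as nested lists with shape [C][H][W].
    The distance metric is an assumption, not taken from the paper.
    """
    channels, height, width = len(enc), len(enc[0]), len(enc[0][0])
    return [[sum((enc[c][i][j] - dec[c][i][j]) ** 2 for c in range(channels))
             for j in range(width)]
            for i in range(height)]


def upsample_nearest(m, out_h, out_w):
    """Nearest-neighbor upsampling of a 2D map to (out_h, out_w)."""
    h, w = len(m), len(m[0])
    return [[m[i * h // out_h][j * w // out_w] for j in range(out_w)]
            for i in range(out_h)]


def anomaly_map(enc_feats, dec_feats, out_h, out_w):
    """Sum the upsampled residual maps of all levels (assumed fusion rule)."""
    fused = [[0.0] * out_w for _ in range(out_h)]
    for enc, dec in zip(enc_feats, dec_feats):
        up = upsample_nearest(residual_map(enc, dec), out_h, out_w)
        for i in range(out_h):
            for j in range(out_w):
                fused[i][j] += up[i][j]
    return fused


def image_score(amap):
    """Image-level score as the maximum of the map (assumed)."""
    return max(max(row) for row in amap)


def const_map(channels, height, width, value):
    """Toy feature map filled with a constant value."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]


# Toy example with K = 3 levels at resolutions 4x4, 2x2, and 1x1.
enc_feats = [const_map(2, 4, 4, 1.0), const_map(2, 2, 2, 1.0), const_map(2, 1, 1, 1.0)]
dec_feats = copy.deepcopy(enc_feats)   # perfect reconstruction everywhere...
dec_feats[0][0][0][3] = 0.0            # ...except one fine-level location
dec_feats[1][0][0][1] = -1.0           # ...and one mid-level location

amap = anomaly_map(enc_feats, dec_feats, 4, 4)
score = image_score(amap)
```

In this toy case the fine-level error affects a single location, while the mid-level error covers the 2x2 block it upsamples to, so the fused map is largest where both levels disagree with the encoder. This illustrates why combining several levels can localize more robustly than a single level.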
