Omni-AD: Learning to Reconstruct Global and Local Features for Multi-class Anomaly Detection
Jiajie Quan, Ao Tong, Yuxuan Cai, Xinwei He, Yulong Wang, Yang Zhou
TL;DR
Omni-AD targets multi-class unsupervised anomaly detection (MUAD) by learning both global and local normal patterns to avoid reconstruction shortcuts. It introduces Omni-block, a two-branch decoder block that combines a global attention pathway built on learnable tokens with a local depthwise convolution pathway to reconstruct normal features across scales. The approach achieves state-of-the-art results on MVTec-AD, VisA, and Real-IAD, with ablations validating the effectiveness of the learnable tokens, the necessity of both branches, and the chosen architectural depths. This work offers a practical, efficient, unified MUAD solution with potential applications beyond industrial anomaly detection.
Abstract
In multi-class unsupervised anomaly detection (MUAD), reconstruction-based methods learn to map input images to normal patterns in order to identify anomalous pixels. However, this strategy easily falls into the well-known "learning shortcut" issue: when the decoder fails to capture normal patterns, it simply reconstructs normal and abnormal samples alike. To address this, we propose to learn the input features in both global and local manners, forcing the network to memorize the normal patterns more comprehensively. Specifically, we design a two-branch decoder block, named Omni-block. The global branch chains two self-attention blocks, replacing the query in the first and the (key, value) pair in the second with learnable tokens, thus capturing global features of normal patterns concisely and thoroughly. The local branch comprises depthwise separable convolutions, whose locality enables effective and efficient learning of local features of normal patterns. By stacking Omni-blocks, we build a framework, Omni-AD, that learns normal patterns at different granularities and reconstructs them progressively. Comprehensive experiments on public anomaly detection benchmarks show that our method outperforms state-of-the-art approaches in MUAD. Code is available at https://github.com/easyoo/Omni-AD.git
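The two-branch idea can be illustrated with a minimal NumPy sketch. It is not the released implementation: the function names (`global_branch`, `local_branch`, `omni_block_sketch`), the parameter dictionary, the 3x3 kernel size, the fusion of the two branches by addition, and the choice of one query token per spatial position (so that the output keeps the input shape) are all assumptions made for illustration. Linear projections, normalization, and multi-head splitting are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention: q (Nq, C), k and v (Nk, C) -> (Nq, C)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def global_branch(x_seq, q_tokens, k_tokens, v_tokens):
    """x_seq: (N, C) flattened features.
    Stage 1: learnable tokens act as the query; the features supply key and value.
    Stage 2: the stage-1 output is the query; learnable tokens supply key and value.
    """
    h = attention(q_tokens, x_seq, x_seq)    # (N, C), assuming N query tokens
    return attention(h, k_tokens, v_tokens)  # (N, C)

def local_branch(x_map, dw_kernels, pw_weights):
    """x_map: (C, H, W). Depthwise 3x3 convolution (one kernel per channel,
    zero padding) followed by a pointwise 1x1 convolution that mixes channels."""
    C, H, W = x_map.shape
    padded = np.pad(x_map, ((0, 0), (1, 1), (1, 1)))
    dw = np.zeros_like(x_map)
    for i in range(3):
        for j in range(3):
            dw += dw_kernels[:, i, j][:, None, None] * padded[:, i:i + H, j:j + W]
    return np.einsum("oc,chw->ohw", pw_weights, dw)

def omni_block_sketch(x_map, params):
    """Runs both branches on a (C, H, W) feature map and sums them (assumed fusion)."""
    C, H, W = x_map.shape
    x_seq = x_map.reshape(C, H * W).T        # (N, C) with N = H * W
    g = global_branch(x_seq, params["q"], params["k"], params["v"])
    g_map = g.T.reshape(C, H, W)
    l_map = local_branch(x_map, params["dw"], params["pw"])
    return g_map + l_map

# Usage with random stand-ins for the learnable parameters.
rng = np.random.default_rng(0)
C, H, W, M = 4, 5, 6, 8                      # M: number of key/value tokens (hypothetical)
params = {
    "q": rng.normal(size=(H * W, C)),
    "k": rng.normal(size=(M, C)),
    "v": rng.normal(size=(M, C)),
    "dw": rng.normal(size=(C, 3, 3)),
    "pw": rng.normal(size=(C, C)),
}
features = rng.normal(size=(C, H, W))
reconstructed = omni_block_sketch(features, params)
```

In this sketch, every output row of the global branch is a convex combination of the learnable value tokens, so that branch cannot copy its input directly. This is one way to read the abstract's motivation of avoiding reconstruction shortcuts, though the actual mechanism in the paper may differ.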
