A Recover-then-Discriminate Framework for Robust Anomaly Detection
Peng Xing, Dong Zhang, Jinhui Tang, Zechao Li
TL;DR
This work addresses gaps in anomaly detection by identifying two core failure modes in recovery-based methods and proposing a Recover-then-Discriminate (ReDi) framework. The Recover Network employs HIP, which uses a self-generated map (e.g., HOG) and a similar normal image as a prompt to guide recovery without revealing anomalous details, mitigating recovery shortcuts. The Discriminate Network compares multi-scale features from a reference (pre-trained) branch and a recovered branch, guided by a cosine-alignment loss $L_D$ and a self-correlation loss $L_S$, enabling effective detection and precise segmentation in feature space. Extensive experiments on MVTec-AD and KolektorSDD2 show state-of-the-art or competitive performance for both anomaly detection and segmentation. Ablations validate the benefits of HIP, $L_S$, and the FRB, and are accompanied by a thorough analysis of self-generated maps and prompts. The approach offers a practical, unsupervised AD solution that leverages low-level structure and normal semantic information to robustly identify anomalies in complex visual data.
Abstract
Anomaly detection (AD) has been extensively studied and applied in a wide range of scenarios in recent years. However, a gap remains between the recognition accuracy achieved so far and the level required for practical applications. In this paper, we start from an analysis of two fundamental yet representative types of failure case in the baseline model, and reveal the reasons that hinder current AD methods from achieving higher recognition accuracy. Specifically, from Case-1, we find that the main obstacle for current AD methods is that the inputs to the recovery model contain a large number of detailed features to be recovered, so that normal areas are not recovered to their original state while abnormal areas are. From Case-2, we surprisingly find that abnormal areas that cannot be recognized in image-level representations can be easily recognized in feature-level representations. Based on these observations, we propose a novel Recover-then-Discriminate (ReDi) framework for AD. ReDi takes a self-generated feature map and a selected prompt image as explicit input information to address the problems in Case-1. In addition, a feature-level discriminative network is proposed to enhance the abnormal differences between the recovered representation and the input representation. Extensive experimental results on two popular yet challenging AD datasets validate that ReDi achieves new state-of-the-art accuracy.
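To make the feature-level comparison concrete, the following is a minimal sketch of how a per-location anomaly map could be computed as the cosine distance between a reference feature map and a recovered feature map. It is not the paper's implementation: the function names `cosine_distance_map` and `image_score`, the nested-list feature layout, the toy values, and the use of the maximum as an image-level score are all assumptions made for illustration.

```python
import math


def cosine_distance_map(ref_feats, rec_feats, eps=1e-8):
    """Per-location cosine distance between two feature maps.

    Both inputs are nested lists indexed as [row][col][channel]
    (a hypothetical layout chosen to keep the sketch dependency-free).
    """
    dist_map = []
    for ref_row, rec_row in zip(ref_feats, rec_feats):
        out_row = []
        for ref_vec, rec_vec in zip(ref_row, rec_row):
            dot = sum(a * b for a, b in zip(ref_vec, rec_vec))
            ref_norm = math.sqrt(sum(a * a for a in ref_vec))
            rec_norm = math.sqrt(sum(b * b for b in rec_vec))
            cos_sim = dot / (ref_norm * rec_norm + eps)
            # Distance near 0: features agree; near 1: features are orthogonal.
            out_row.append(1.0 - cos_sim)
        dist_map.append(out_row)
    return dist_map


def image_score(dist_map):
    """Assumed image-level score: the largest per-location distance."""
    return max(max(row) for row in dist_map)


# Toy 2x2 feature maps with 2 channels each (illustrative values only).
ref = [[[1.0, 0.0], [0.0, 1.0]],
       [[1.0, 1.0], [1.0, 0.0]]]
rec = [[[1.0, 0.0], [0.0, 1.0]],
       [[1.0, 1.0], [0.0, 1.0]]]  # bottom-right location disagrees

amap = cosine_distance_map(ref, rec)
score = image_score(amap)
```

In this toy case only the bottom-right location, where the two branches produce orthogonal features, receives a high distance. In practice such a map would be computed at several feature scales and combined, but how ReDi fuses scales is not specified in the text above.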
