Leveraging Learning Bias for Noisy Anomaly Detection
Yuxin Zhang, Yunkang Cao, Yuqi Cheng, Yihan Sun, Weiming Shen
TL;DR
This work tackles fully unsupervised image anomaly detection under training-set contamination by exploiting a learning bias: models learn normal patterns faster, whereas anomalies are more diverse. It introduces a two-stage framework. Stage 1 trains multiple sub-models on partitioned subsets of the training data and uses cross-model consensus to purify it; Stage 2 trains the final detector on the purified set. Demonstrated on Real-IAD with a Dinomaly backbone, the approach achieves state-of-the-art anomaly localization across contamination levels and remains robust to noise through targeted subset partitioning and score aggregation. The method is a practical, model-agnostic solution for real-world settings where clean, anomaly-free training data are unavailable, with implications for industrial inspection and beyond.
Abstract
This paper addresses the challenge of fully unsupervised image anomaly detection (FUIAD), where training data may contain unlabeled anomalies. Conventional methods assume anomaly-free training data, but real-world contamination leads models to absorb anomalies as normal, degrading detection performance. To mitigate this, we propose a two-stage framework that systematically exploits an inherent learning bias in models. This bias stems from: (1) the statistical dominance of normal samples, which drives models to prioritize learning stable normal patterns over sparse anomalies, and (2) feature-space divergence, where normal data exhibit high intra-class consistency while anomalies display high diversity, leading to unstable model responses. Leveraging this bias, Stage 1 partitions the training set into subsets, trains a sub-model on each, and aggregates cross-model anomaly scores to filter out likely anomalies, yielding a purified dataset. Stage 2 trains the final detector on this purified dataset. Experiments on the Real-IAD benchmark demonstrate superior anomaly detection and localization performance under different noise conditions. Ablation studies further validate the framework's resilience to contamination and emphasize the critical role of exploiting the learning bias. The model-agnostic design ensures compatibility with diverse unsupervised backbones, offering a practical solution for real-world scenarios with imperfect training data. Code is available at https://github.com/hustzhangyuxin/LLBNAD.
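The two-stage procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It is a toy version under several assumptions: `ToyDetector` is a hypothetical stand-in for an unsupervised backbone such as Dinomaly, feature vectors replace images, every sub-model scores the full training set, scores are aggregated by a plain mean, and a fixed `drop_ratio` decides how many samples are filtered out. None of these choices are stated in the abstract.

```python
import numpy as np


class ToyDetector:
    """Hypothetical stand-in for an unsupervised backbone (assumption)."""

    def fit(self, x):
        # "Training" here just memorizes per-feature mean and spread.
        self.mean_ = x.mean(axis=0)
        self.std_ = x.std(axis=0) + 1e-8
        return self

    def score(self, x):
        # Higher score = more anomalous (normalized distance to the mean).
        return np.linalg.norm((x - self.mean_) / self.std_, axis=1)


def purify(x, n_subsets=4, drop_ratio=0.2, seed=0):
    """Stage 1: return sorted indices of samples kept after filtering."""
    rng = np.random.default_rng(seed)
    # Random partition of the (possibly contaminated) training set.
    subsets = np.array_split(rng.permutation(len(x)), n_subsets)
    # One sub-model per subset; each one scores every training sample.
    scores = np.stack([ToyDetector().fit(x[idx]).score(x) for idx in subsets])
    # Cross-model aggregation: a plain mean is assumed for illustration.
    consensus = scores.mean(axis=0)
    n_keep = len(x) - int(round(drop_ratio * len(x)))
    return np.sort(np.argsort(consensus)[:n_keep])


def train_two_stage(x, **kwargs):
    """Stage 1 purifies the data; Stage 2 trains the final detector on it."""
    keep = purify(x, **kwargs)
    return ToyDetector().fit(x[keep]), keep


# Synthetic demo: 180 tightly clustered normals, 20 diverse anomalies.
demo_rng = np.random.default_rng(0)
normal = demo_rng.normal(0.0, 1.0, size=(180, 8))
anomalous = demo_rng.normal(6.0, 3.0, size=(20, 8))
x_train = np.vstack([normal, anomalous])

detector, keep = train_two_stage(x_train, n_subsets=4, drop_ratio=0.2)
remaining_anomalies = int((keep >= 180).sum())
print(f"kept {len(keep)} samples, {remaining_anomalies} anomalies remain")
```

In this toy setting the anomalies are few and widely spread, so each sub-model is still dominated by normal samples and assigns the anomalies high scores. That is the learning bias the abstract refers to, reduced to its simplest form. A real pipeline would replace `ToyDetector` with the chosen backbone and would likely use a more careful partitioning and aggregation rule.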
