Leveraging Learning Bias for Noisy Anomaly Detection

Yuxin Zhang, Yunkang Cao, Yuqi Cheng, Yihan Sun, Weiming Shen

TL;DR

This work tackles fully unsupervised image anomaly detection when the training set is contaminated with unlabeled anomalies. It exploits a learning bias: models learn normal patterns first because normal samples dominate the training data, while anomalies are more diverse and elicit unstable model responses. The framework has two stages. Stage 1 partitions the training set into subsets, trains one sub-model per subset, and aggregates their anomaly scores into a cross-model consensus used to purify the training data. Stage 2 trains the final detector on the purified set. Evaluated on Real-IAD with a Dinomaly backbone, the approach reports superior anomaly detection and localization across contamination levels. Because the design is model-agnostic, it offers a practical option for settings such as industrial inspection, where clean anomaly-free training data are unavailable.

Abstract

This paper addresses the challenge of fully unsupervised image anomaly detection (FUIAD), where training data may contain unlabeled anomalies. Conventional methods assume anomaly-free training data, but real-world contamination leads models to absorb anomalies as normal, degrading detection performance. To mitigate this, we propose a two-stage framework that systematically exploits inherent learning bias in models. The learning bias stems from: (1) the statistical dominance of normal samples, driving models to prioritize learning stable normal patterns over sparse anomalies, and (2) feature-space divergence, where normal data exhibit high intra-class consistency while anomalies display high diversity, leading to unstable model responses. Leveraging the learning bias, stage 1 partitions the training set into subsets, trains sub-models, and aggregates cross-model anomaly scores to filter a purified dataset. Stage 2 trains the final detector on this dataset. Experiments on the Real-IAD benchmark demonstrate superior anomaly detection and localization performance under different noise conditions. Ablation studies further validate the framework's contamination resilience, emphasizing the critical role of learning bias exploitation. The model-agnostic design ensures compatibility with diverse unsupervised backbones, offering a practical solution for real-world scenarios with imperfect training data. Code is available at https://github.com/hustzhangyuxin/LLBNAD.

Paper Structure

This paper contains 22 sections, 5 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Multiple Subset Integration for Robust Anomaly Detection. (a) Overall data distribution with concentrated normal samples (green) and scattered anomalies (red). (b)-(c) Dark red and dark green points denote anomalous and normal samples within the subsets used to train individual sub-models. Green dashed boundaries enclose samples assigned low anomaly scores by each sub-model when evaluated across the entire training set. (d) The green-shaded region identifies high-confidence normal samples consistently achieving low anomaly scores across all sub-models.
  • Figure 2: Framework of the proposed method. Our framework comprises two stages. In stage 1, $\mathcal{D}_{\text{train}}$ is partitioned into $k$ subsets for individual model training. Each model is then tested on the whole training set to yield anomaly scores. The scores are aggregated into consensus anomaly scores and sorted, and the $t\%$ of samples with the lowest scores are selected as $\mathcal{D}_{\text{pure}}$. In stage 2, the final model is trained on $\mathcal{D}_{\text{pure}}$ for anomaly detection.
  • Figure 3: Qualitative results of the proposed method. From top to bottom: the input images, the ground-truth masks, and the output anomaly maps at noise ratios $\alpha = 0.1$, $0.2$, and $0.4$.
  • Figure 4: Normal vs. Abnormal Samples in Sub-models and Aggregated Models under Different Noise Ratios. #Normal and #Anomaly denote the number of selected normal and anomaly samples, respectively.
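
The stage 1 purification described in the Figure 2 caption can be illustrated with a minimal sketch. Several pieces here are assumptions made only for illustration and are not taken from the paper: the sub-model is a hypothetical centroid-distance scorer on toy feature vectors (the paper uses a Dinomaly backbone on images), the partition is a uniform random split, and the consensus score is the mean of per-model min-max normalized scores. The function names `make_toy_features`, `fit_centroid`, `score_centroid`, and `purify` are likewise hypothetical.

```python
import numpy as np


def make_toy_features(n_normal=180, n_anomaly=20, dim=8, seed=0):
    """Toy stand-in for image features: tight normal cluster, scattered anomalies."""
    rng = np.random.default_rng(seed)
    normal = rng.normal(0.0, 0.5, size=(n_normal, dim))
    directions = rng.normal(size=(n_anomaly, dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = rng.uniform(6.0, 12.0, size=(n_anomaly, 1))
    # Normals occupy indices [0, n_normal); anomalies follow.
    return np.vstack([normal, directions * radii])


def fit_centroid(x):
    """Hypothetical sub-model: stores the mean of its training subset."""
    return x.mean(axis=0)


def score_centroid(model, x):
    """Anomaly score: Euclidean distance of each sample to the stored mean."""
    return np.linalg.norm(x - model, axis=1)


def purify(features, k=4, t=80.0, seed=0):
    """Return indices of the t% lowest-consensus-score samples and the scores."""
    rng = np.random.default_rng(seed)
    n = len(features)
    # Assumption: uniform random partition of the training set into k subsets.
    subsets = np.array_split(rng.permutation(n), k)

    scores = np.empty((k, n))
    for i, idx in enumerate(subsets):
        model = fit_centroid(features[idx])      # train on one subset
        s = score_centroid(model, features)      # score the whole training set
        # Assumption: min-max normalize so sub-model scores are comparable.
        scores[i] = (s - s.min()) / (s.max() - s.min() + 1e-12)

    # Assumption: consensus is the mean of the normalized sub-model scores.
    consensus = scores.mean(axis=0)
    n_keep = int(round(n * t / 100.0))
    keep = np.sort(np.argsort(consensus)[:n_keep])
    return keep, consensus


if __name__ == "__main__":
    x = make_toy_features()
    keep, consensus = purify(x, k=4, t=80.0)
    print(f"kept {len(keep)} of {len(x)} samples for stage 2")
```

In this sketch, stage 2 would simply train the final detector on `features[keep]`. The values of `k` and `t` are free parameters here; `t` should leave enough margin to discard the expected fraction of contaminated samples.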