AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios

Ziming Huang, Xurui Li, Haotian Liu, Feng Xue, Yuzhe Wang, Yu Zhou

TL;DR

AnomalyNCD addresses the challenge of discovering novel anomaly classes in industrial settings by learning from isolated anomaly regions rather than whole images. It introduces Main Element Binarization (MEBin) to produce anomaly-centered inputs, Mask-Guided Representation Learning (MGViT) to focus representations on anomalous regions, and a region merging strategy to robustly classify at both region and image levels. The approach is compatible with various anomaly detectors and leverages self-supervised and pseudo-labeled supervision to learn discriminative, region-specific features, achieving state-of-the-art gains on MVTec AD and MTD when combined with zero-shot anomaly detection. These components together enable effective multi-class anomaly discovery with practical robustness to detector quality and complex, combined-type anomalies in industrial data.

Abstract

Recently, multi-class anomaly classification has garnered increasing attention. Previous methods cluster anomalies directly but often struggle because they lack prior knowledge about anomalies. Acquiring this knowledge faces two issues: anomalies are often non-prominent, and their semantics are weak. In this paper, we propose AnomalyNCD, a multi-class anomaly classification network compatible with different anomaly detection methods. To address the non-prominence of anomalies, we design main element binarization (MEBin) to obtain anomaly-centered images, ensuring anomalies are learned while avoiding the impact of incorrect detections. Next, to learn anomalies with weak semantics, we design mask-guided representation learning, which focuses on isolated anomalies guided by masks and reduces confusion from erroneous inputs through corrected pseudo labels. Finally, to enable flexible classification at both region and image levels, we develop a region merging strategy that determines the overall image category based on the classified anomaly regions. Our method outperforms state-of-the-art methods on the MVTec AD and MTD datasets. Compared with current methods, AnomalyNCD combined with a zero-shot anomaly detection method achieves a 10.8% $F_1$ gain, 8.8% NMI gain, and 9.5% ARI gain on MVTec AD, and a 12.8% $F_1$ gain, 5.7% NMI gain, and 10.8% ARI gain on MTD. Code is available at https://github.com/HUST-SLOW/AnomalyNCD.
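
The abstract does not spell out how MEBin chooses a binarization threshold. As a rough illustration only, the sketch below assumes one plausible reading: sweep several thresholds over the detector's anomaly score map, count the connected regions at each one, and keep a threshold from the longest run in which that count stays constant. The function names (`count_components`, `pick_stable_threshold`), the 4-connectivity, and the choice of the lowest threshold in the run are all hypothetical and are not taken from the paper or its code.

```python
from typing import List, Optional, Sequence

Grid = Sequence[Sequence[float]]


def count_components(score_map: Grid, threshold: float) -> int:
    """Count 4-connected regions whose anomaly score is >= threshold."""
    h, w = len(score_map), len(score_map[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for i in range(h):
        for j in range(w):
            if seen[i][j] or score_map[i][j] < threshold:
                continue
            # New region found: flood-fill it with an explicit stack.
            count += 1
            seen[i][j] = True
            stack = [(i, j)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and score_map[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return count


def pick_stable_threshold(score_map: Grid,
                          thresholds: List[float]) -> Optional[float]:
    """Return the lowest threshold of the longest run of consecutive
    thresholds that all yield the same non-zero region count.

    Assumption (not from the source): a long stable run indicates the
    "main" anomaly regions rather than noise.
    """
    counts = [count_components(score_map, t) for t in thresholds]
    best_start, best_len = None, 0
    start = 0
    for k in range(1, len(counts) + 1):
        if k == len(counts) or counts[k] != counts[start]:
            if counts[start] > 0 and k - start > best_len:
                best_start, best_len = start, k - start
            start = k
    return None if best_start is None else thresholds[best_start]


# Toy score map: one strong blob plus a weak isolated false detection (0.3).
toy_map = [
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.1, 0.3],
    [0.1, 0.8, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1],
]
toy_thresholds = [0.05, 0.2, 0.4, 0.5, 0.6, 0.7, 0.95]
chosen = pick_stable_threshold(toy_map, toy_thresholds)  # 0.4
```

In this toy case the weak detection only survives at the 0.2 threshold, so the stable run starting at 0.4 isolates the strong blob, which matches the stated goal of avoiding the impact of incorrect detections.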

Paper Structure

This paper contains 43 sections, 12 equations, 18 figures, 17 tables.

Figures (18)

  • Figure 1: Comparison between solutions organizing anomalies into groups. (a) Anomaly clustering methods extract the features of the anomaly region and employ unsupervised clustering algorithms to cluster the anomalies. (b) Vanilla NCD methods typically employ a trainable feature extractor and classifier. These components process object-centered images from both known and unknown classes. (c) Our method aims to learn features of isolated anomaly regions using anomaly-centered sub-images and masks from MEBin.
  • Figure 2: Illustration of the Training Process for AnomalyNCD. First, we apply main element binarization (MEBin) to segment anomaly masks from detection results and generate anomaly-centered sub-images. Second, we introduce mask-guided representation learning to learn discriminative features of anomalies, classifying sub-images into various categories.
  • Figure 3: Pipeline of the Main Element Binarization. In 3D visualization, we show the changes of segmented regions under different thresholds. The 3D map (a) corresponds to the 2D binary segmentation mask (a).
  • Figure 4: Visualization of the self-attention of the [CLS] token on the last layer's heads. DINO attention refers to the [CLS] token extracted from a DINO pre-trained ViT that mainly focuses on a foreground object. AnomalyNCD uses a mask to direct the [CLS] token's attention to the anomalous regions.
  • Figure 5: Region merging strategy for image-level classification. Using average prediction, the output of normal cropped images (Sub-image-2) leads to final misclassification, while area-weighted prediction can reduce the negative effect of normal cropped images on the final result.
  • ...and 13 more figures
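
The Figure 5 caption contrasts average prediction with area-weighted prediction when merging region-level outputs into one image-level class. A minimal sketch of that contrast follows, assuming each cropped sub-image comes with a class-probability vector and the pixel area of its mask; the function names and the toy numbers are hypothetical, and the exact weighting used by AnomalyNCD may differ.

```python
from typing import List, Sequence


def merge_average(region_probs: Sequence[Sequence[float]]) -> int:
    """Image-level class from the unweighted mean of region probabilities."""
    n_classes = len(region_probs[0])
    mean = [sum(p[c] for p in region_probs) / len(region_probs)
            for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


def merge_area_weighted(region_probs: Sequence[Sequence[float]],
                        areas: Sequence[float]) -> int:
    """Image-level class from region probabilities weighted by mask area."""
    total = float(sum(areas))
    n_classes = len(region_probs[0])
    weighted: List[float] = [
        sum(a * p[c] for p, a in zip(region_probs, areas)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=weighted.__getitem__)


# Toy example (made-up numbers): sub-image 1 is a large true anomaly of
# class 0; sub-image 2 is a small crop of normal texture that the
# classifier confidently assigns to class 1.
probs = [[0.6, 0.4], [0.1, 0.9]]
areas = [900, 50]

avg_class = merge_average(probs)                  # 1: the normal crop dominates
area_class = merge_area_weighted(probs, areas)    # 0: the large region dominates
```

The sketch relies on the assumption that falsely detected normal regions tend to have small masks; under that assumption, weighting by area reduces their influence on the image-level result, as the caption describes.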