MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images

Xurui Li, Ziming Huang, Feng Xue, Yu Zhou

TL;DR

MuSc tackles zero-shot industrial anomaly classification and segmentation by exploiting unlabeled test images. It introduces Local Neighborhood Aggregation with Multiple Degrees (LNAMD) to produce multi-scale patch representations, a Mutual Scoring Mechanism (MSM) to infer patch-level anomaly scores by cross-comparing patches across unlabeled images, and a Re-scoring with Constrained Image-level Neighborhood (RsCIN) to refine image-level classifications. The approach achieves substantial gains over existing zero-shot baselines on the MVTec AD and VisA benchmarks and remains competitive with several few-shot and some full-shot methods, all without labeled training data or prompts. By mining normal and abnormal cues latent in unlabeled test images, MuSc demonstrates a practical, data-efficient path for industrial anomaly detection tasks with real-world impact in quality control and manufacturing inspection.

Abstract

This paper studies zero-shot anomaly classification (AC) and segmentation (AS) in industrial vision. We reveal that the abundant normal and abnormal cues implicit in unlabeled test images can be exploited for anomaly determination, a source of information that prior methods ignore. Our key observation is that, in industrial product images, normal image patches can find a relatively large number of similar patches in other unlabeled images, while abnormal patches have only a few similar patches. We leverage this discriminative characteristic to design a novel zero-shot AC/AS method based on Mutual Scoring (MuSc) of the unlabeled images, which needs neither training nor prompts. Specifically, we perform Local Neighborhood Aggregation with Multiple Degrees (LNAMD) to obtain patch features capable of representing anomalies of varying sizes. We then propose the Mutual Scoring Mechanism (MSM), in which the unlabeled test images assign anomaly scores to each other. Furthermore, we present an optimization approach named Re-scoring with Constrained Image-level Neighborhood (RsCIN) for image-level anomaly classification, which suppresses the false positives caused by noise in normal images. The superior performance on the challenging MVTec AD and VisA datasets demonstrates the effectiveness of our approach. Compared with the state-of-the-art zero-shot approaches, MuSc achieves a 21.1% absolute PRO gain (from 72.7% to 93.8%) on MVTec AD, and a 19.4% pixel-AP gain and a 14.7% pixel-AUROC gain on VisA. In addition, our zero-shot approach outperforms most of the few-shot approaches and is comparable to some one-class methods. Code is available at https://github.com/xrli-U/MuSc.
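
The mutual-scoring idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the repository above for that); it is a toy version under stated assumptions. The function names `l2` and `mutual_scores`, the parameter `keep_fraction`, the Euclidean distance, and the 2-D toy features are all hypothetical choices made for illustration. Each patch is scored by its nearest-patch distance to every other unlabeled image, and only the smallest fraction of those distances is averaged, so that a normal patch is not penalized by the few images that happen to be abnormal.

```python
import math


def l2(u, v):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mutual_scores(images, keep_fraction=0.5):
    """Score every patch using the other unlabeled images.

    images: list of images, each a list of patch feature vectors.
    keep_fraction: hypothetical parameter; the fraction of smallest
        per-image distances that is averaged into the final score.
    Returns a list (per image) of lists (per patch) of anomaly scores.
    """
    all_scores = []
    for i, patches in enumerate(images):
        image_scores = []
        for patch in patches:
            # Distance from this patch to its nearest patch in each other image.
            dists = sorted(
                min(l2(patch, other_patch) for other_patch in other)
                for j, other in enumerate(images)
                if j != i
            )
            # Average only the smallest distances: a normal patch finds
            # close matches in most images, an abnormal one in few or none.
            k = max(1, math.ceil(keep_fraction * len(dists)))
            image_scores.append(sum(dists[:k]) / k)
        all_scores.append(image_scores)
    return all_scores


# Toy data: three normal images and one image with an outlying patch.
images = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [5.0, 5.0]],  # second patch plays the role of a defect
]
scores = mutual_scores(images, keep_fraction=0.5)
for idx, image_scores in enumerate(scores):
    print(idx, [round(s, 3) for s in image_scores])
```

In this toy run the outlying patch in the last image receives the highest score, while the normal patches score zero because they find exact matches in at least two other images. Setting `keep_fraction=1.0` would instead raise the score of the normal `[1.0, 1.0]` patches, since their distance to the abnormal image would then be included in the average.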

Paper Structure (28 sections, 11 equations, 16 figures, 18 tables)

Figures (16)

  • Figure 1: (a) One-class AC and AS based on a memory bank, which requires many normal reference images. (b) Zero-shot AC and AS based on CLIP, which relies on additional text prompts. (c) Our MuSc leverages only the unlabeled test images, at both the patch level and the image level.
  • Figure 2: Our MuSc architecture. It consists of three parts: feature extraction and optimization (Section \ref{Sec:feature-agg}), MSM to obtain anomaly segmentation results (Section \ref{Sec:patch-level}), and RsCIN to optimize classification results (Section \ref{Sec:image-level}).
  • Figure 3: The visualization of anomaly segmentation results with different aggregation degrees $r$.
  • Figure 4: Top: Histograms of $A_{i,l}^{m,r}$ for all normal patches (a) and abnormal patches (b). Bottom: Histograms of $\overline{a}_{i,l}^{m,r}$ for normal and abnormal patches when averaging over the whole value interval (c) and over only the minimum $X\%$ value interval (d). The blue and orange parts indicate the anomaly scores of normal patches and abnormal patches, respectively, under the setting $r=3$ and $l=2$.
  • Figure 5: Top: Histogram of anomaly classification scores of unlabeled test images before (a) and after (b) using RsCIN, with blue representing normal images and orange representing abnormal images. Bottom: A normal and an abnormal example of RsCIN.
  • ...and 11 more figures