CSAD: Unsupervised Component Segmentation for Logical Anomaly Detection

Yu-Hsuan Hsieh, Shang-Hong Lai

TL;DR

CSAD introduces unsupervised semantic pseudo-label generation to train a lightweight component segmentation network for logical anomaly detection. It combines a Patch Histogram module with a Local-Global Student-Teacher (LGST) framework to detect both position/quantity-based and scale-based anomalies, achieving a total AUROC of 95.3% on MVTec LOCO AD with low latency (8.9 ms) and high throughput (321.8 fps). Semantic pseudo-labels are produced without human annotations via RAM++ and Grounded-SAM, followed by segmentation training with LSA augmentation and multiple losses. The method demonstrates strong practical impact by reducing labeling requirements while delivering state-of-the-art performance and efficiency in industrial anomaly detection tasks.

Abstract

To improve logical anomaly detection, some previous works have integrated segmentation techniques with conventional anomaly detection methods. Although these methods are effective, they frequently lead to unsatisfactory segmentation results and require manual annotations. To address these drawbacks, we develop an unsupervised component segmentation technique that leverages foundation models to autonomously generate training labels for a lightweight segmentation network without human labeling. Integrating this new segmentation technique with our proposed Patch Histogram module and the Local-Global Student-Teacher (LGST) module, we achieve a detection AUROC of 95.3% in the MVTec LOCO AD dataset, which surpasses previous SOTA methods. Furthermore, our proposed method provides lower latency and higher throughput than most existing approaches.

Paper Structure (34 sections, 4 equations, 13 figures, 8 tables)

Figures (13)

  • Figure 1: The speed-performance plot on the MVTec LOCO AD benchmark. The x- and y-axis indicate inference latency and average detection AUROC, respectively.
  • Figure 2: Proposed semantic pseudo-label generation that generates semantic pseudo-labels from normal images only.
  • Figure 3: Segmentation results for five categories from MVTec LOCO AD. The red bounding box marks the anomalous region of the image, and each color in the segmentation image represents the class label of the pixel.
  • Figure 4: An example illustrating the effectiveness of patch histograms in addressing position-related logical anomalies.
  • Figure 5: Overall architecture of the proposed CSAD in the inference stage. It consists of two branches: a Patch Histogram branch that detects anomalies using component segmentation and an LGST branch that detects both small and large-scale anomalies.
  • ...and 8 more figures
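
The idea behind Figure 4, that patch histograms expose position-related anomalies which a single image-level class histogram misses, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 2x2 grid, the toy 4x4 segmentation maps, and the L1 distance used for comparison are all assumptions chosen for illustration.

```python
import numpy as np


def class_histogram(seg, num_classes):
    """Normalized frequency of each class label in an integer label map."""
    counts = np.bincount(seg.ravel(), minlength=num_classes).astype(np.float64)
    return counts / counts.sum()


def patch_histogram(seg, num_classes, grid):
    """Concatenate class histograms computed on a grid x grid split of the map.

    Hypothetical helper: the grid layout is an illustrative assumption.
    """
    h, w = seg.shape
    rows = np.array_split(np.arange(h), grid)
    cols = np.array_split(np.arange(w), grid)
    feats = []
    for r in rows:
        for c in cols:
            patch = seg[r[0]:r[-1] + 1, c[0]:c[-1] + 1]
            feats.append(class_histogram(patch, num_classes))
    return np.concatenate(feats)


# Toy segmentation maps: class 1 is a component, class 0 is background.
normal = np.zeros((4, 4), dtype=np.int64)
normal[0:2, 0:2] = 1    # component in the top-left corner

shifted = np.zeros((4, 4), dtype=np.int64)
shifted[2:4, 2:4] = 1   # same component, moved to the bottom-right corner

# The image-level histogram cannot tell the two maps apart ...
global_diff = np.abs(
    class_histogram(normal, 2) - class_histogram(shifted, 2)
).sum()

# ... while the patch-level histogram changes when the component moves.
patch_diff = np.abs(
    patch_histogram(normal, 2, 2) - patch_histogram(shifted, 2, 2)
).sum()

print(global_diff, patch_diff)
```

In this toy case both maps contain the same number of pixels per class, so the image-level difference is zero, while the patch-level difference is non-zero because the component occupies a different grid cell.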