Multi-Scale Distillation for RGB-D Anomaly Detection on the PD-REAL Dataset

Jianjian Qin, Chao Zhang, Chunzhi Gu, Zi Wang, Jun Yu, Yijin Wei, Hui Xiao, Xin Yua

TL;DR

This work presents PD-REAL, a novel large-scale dataset for unsupervised anomaly detection (AD) in the 3D domain, and introduces a multi-scale teacher-student framework with hierarchical distillation for multimodal anomaly detection.

Abstract

We present PD-REAL, a novel large-scale dataset for unsupervised anomaly detection (AD) in the 3D domain. It is motivated by the fact that 2D-only representations in the AD task may fail to capture the geometric structures of anomalies due to uncertainty in lighting conditions or shooting angles. PD-REAL consists entirely of Play-Doh models for 15 object categories and focuses on analyzing the potential benefits of 3D information in a controlled environment. Specifically, objects are first created with six types of anomalies, such as dent, crack, or perforation, and then photographed under different lighting conditions to mimic real-world inspection scenarios. To demonstrate the usefulness of 3D information, we use a commercially available RealSense camera to capture RGB and depth images. Compared to the existing 3D dataset for AD tasks, data acquisition for PD-REAL is significantly cheaper and more easily scalable, and its variables are easier to control. Furthermore, we introduce a multi-scale teacher-student framework with hierarchical distillation for multimodal anomaly detection. This architecture overcomes the inherent limitation of single-scale distillation approaches, which often struggle to reconcile global context with local features. Leveraging multi-level guidance from the teacher network, the student network can effectively capture richer features for anomaly detection. Extensive evaluations of our method and state-of-the-art AD algorithms on our dataset demonstrate, qualitatively and quantitatively, the higher detection accuracy of our method. Our dataset can be downloaded from https://github.com/Andy-cs008/PD-REAL
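
The abstract does not spell out the loss, the backbone, or how scales are fused, so the following is only a generic sketch of how multi-scale teacher-student distillation is commonly scored, not the authors' implementation. The function names, the cosine-distance discrepancy, the nearest-neighbour upsampling, and the summation across scales are all assumptions made for illustration.

```python
import numpy as np


def cosine_distance_map(teacher, student, eps=1e-8):
    """Per-pixel (1 - cosine similarity) between two (C, H, W) feature maps."""
    dot = (teacher * student).sum(axis=0)
    norm = np.linalg.norm(teacher, axis=0) * np.linalg.norm(student, axis=0)
    return 1.0 - dot / (norm + eps)


def upsample_nearest(score_map, size):
    """Nearest-neighbour upsampling of an (H, W) map to (size, size).

    Assumes `size` is an integer multiple of both H and W.
    """
    factor_h = size // score_map.shape[0]
    factor_w = size // score_map.shape[1]
    return np.repeat(np.repeat(score_map, factor_h, axis=0), factor_w, axis=1)


def distillation_loss(teacher_feats, student_feats):
    """Training objective (assumed): mean cosine distance, summed over scales."""
    return float(sum(cosine_distance_map(t, s).mean()
                     for t, s in zip(teacher_feats, student_feats)))


def multiscale_anomaly_map(teacher_feats, student_feats, size):
    """Test-time score (assumed): per-scale discrepancy maps, upsampled and summed."""
    maps = [upsample_nearest(cosine_distance_map(t, s), size)
            for t, s in zip(teacher_feats, student_feats)]
    return np.sum(maps, axis=0)


# Toy usage with random stand-ins for features at three scales.
rng = np.random.default_rng(0)
teacher = [rng.normal(size=(8, 16, 16)),
           rng.normal(size=(16, 8, 8)),
           rng.normal(size=(32, 4, 4))]
student = [t.copy() for t in teacher]
student[0][:, 2, 3] = -teacher[0][:, 2, 3]  # simulate a local mismatch
anomaly_map = multiscale_anomaly_map(teacher, student, size=16)
```

In this toy setup the student matches the teacher everywhere except one location at the finest scale, so the summed map peaks there. The idea behind combining scales is that coarse feature maps respond to larger, context-level deviations while fine maps localize small defects.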

Paper Structure

This paper contains 18 sections, 4 equations, 9 figures, 6 tables.

Figures (9)

  • Figure 1: An example of an anomalous sample with the anomaly type dent. The anomaly is more clearly visible in the 3D point cloud (b) than in the 2D RGB image (a).
  • Figure 2: Example images from the proposed PD-REAL dataset. Under each image, the first row gives the category, and the second row gives, separated by slashes, the anomaly type highlighted by the blue box, the number of anomaly types in that category, the total number of anomalous samples in that category, and the lighting condition. Lighting conditions are abbreviated as C (controlled), U (uncontrolled), and M (mixed).
  • Figure 3: Distribution of anomalous samples in PD-REAL across the six anomaly types and 15 object categories. The histogram summarizes the number of anomalous samples per category.
  • Figure 4: PD-REAL visualizations for car, airplane, train, bicycle, and pizza. For each category, two anomalous samples are shown. From top to bottom: RGB image, ground-truth, 3D point cloud, and 3D coordinate encodings along the $x$-, $y$-, and $z$-axes.
  • Figure 5: PD-REAL visualizations for chicken, banana, cookie, bread, and sushi. For each category, two anomalous samples are shown. From top to bottom: RGB image, ground-truth, 3D point cloud, and 3D coordinate encodings along the $x$-, $y$-, and $z$-axes.
  • ...and 4 more figures