DMP-3DAD: Cross-Category 3D Anomaly Detection via Realistic Depth Map Projection with Few Normal Samples
Zi Wang, Katsuya Hotta, Koichiro Kamide, Yawen Zou, Jianjian Qin, Chao Zhang, Jun Yu
TL;DR
This work tackles cross-category 3D anomaly detection under few-shot constraints, where only a handful of normal samples are available. It introduces DMP-3DAD, a training-free pipeline that converts point clouds into a fixed set of realistic depth maps across multiple views and extracts features with a frozen CLIP visual encoder. Anomaly scores are computed by view-weighted feature similarity between test samples and normal references, enabling category-agnostic detection without any training or prompts. On ShapeNetPart, DMP-3DAD achieves state-of-the-art mean AUROC across 1-, 3-, and 5-shot settings, demonstrating strong generalization and practical applicability for cross-category 3D anomaly detection.
Abstract
Cross-category anomaly detection for 3D point clouds aims to determine whether an unseen object belongs to a target category using only a few normal examples. Most existing methods rely on category-specific training, which limits their flexibility in few-shot scenarios. In this paper, we propose DMP-3DAD, a training-free framework for cross-category 3D anomaly detection based on multi-view realistic depth map projection. Specifically, our method converts each point cloud into a fixed set of realistic depth images, extracts multi-view representations with a frozen CLIP visual encoder, and performs anomaly detection via weighted feature similarity, requiring no fine-tuning or category-dependent adaptation. Extensive experiments on the ShapeNetPart dataset demonstrate that DMP-3DAD achieves state-of-the-art performance in few-shot settings. The results show that the proposed approach provides a simple yet effective solution for practical cross-category 3D anomaly detection.
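To make the scoring step concrete, the following is a minimal sketch of view-weighted feature similarity, not the authors' implementation. It assumes the per-view features have already been extracted (for example, by a frozen CLIP visual encoder applied to each depth map) and are supplied as arrays. The function name `anomaly_score`, the use of cosine similarity, the uniform default weights, and the choice of taking the best-matching normal reference are all illustrative assumptions; the abstract does not specify how the view weights are obtained or how multiple references are aggregated.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit length (eps guards against zero norm)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def anomaly_score(test_feats, ref_feats, view_weights=None):
    """Hypothetical view-weighted similarity score (higher = more anomalous).

    test_feats:   (V, D) array, one feature vector per rendered view.
    ref_feats:    (K, V, D) array, features of K normal reference samples.
    view_weights: optional (V,) array of non-negative weights; uniform if
                  omitted (an assumption, the weighting scheme is not given
                  in the abstract).
    """
    test = l2_normalize(np.asarray(test_feats, dtype=float))
    refs = l2_normalize(np.asarray(ref_feats, dtype=float))
    num_views = test.shape[0]

    if view_weights is None:
        weights = np.full(num_views, 1.0 / num_views)
    else:
        weights = np.asarray(view_weights, dtype=float)
        weights = weights / weights.sum()

    # Cosine similarity between matching views of the test sample and each
    # reference sample: result has shape (K, V).
    sims = np.einsum("kvd,vd->kv", refs, test)

    # Weighted average over views gives one similarity per reference: (K,).
    per_ref = sims @ weights

    # Assumed aggregation: compare against the closest normal reference.
    return 1.0 - per_ref.max()
```

In a K-shot setting, `ref_feats` would hold the features of the K normal samples; a test object whose views closely match any reference receives a score near 0, while dissimilar objects receive larger scores.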
