PointAD+: Learning Hierarchical Representations for Zero-shot 3D Anomaly Detection

Qihang Zhou; Shibo He; Jiangtao Yan; Wenchao Meng; Jiming Chen

TL;DR

This work tackles zero-shot (ZS) 3D anomaly detection on unseen objects by transferring CLIP's 2D generalization to 3D. It introduces PointAD, which builds implicit point representations from rendering pixels, and PointAD+, which adds explicit, geometry-aware representations through G-aggregation and hierarchical representation learning. A cross-hierarchy contrastive alignment unifies rendering-based and geometry-based anomaly semantics, enabling robust 3D and multimodal (RGB-inclusive) detection without retraining CLIP. Experiments on three datasets show state-of-the-art performance in ZS 3D and multimodal 3D anomaly detection, and ablations validate each module's contribution and the method's robustness.

Abstract

In this paper, we aim to transfer CLIP's robust 2D generalization capabilities to identify 3D anomalies across unseen objects of highly diverse class semantics. To this end, we propose a unified framework that detects and segments 3D anomalies by leveraging both point- and pixel-level information. We first design PointAD, which leverages point-pixel correspondence to represent 3D anomalies through their associated rendering pixel representations. We refer to this as an implicit 3D representation, because it focuses solely on anomalies in rendering pixels and neglects the inherent spatial relationships within point clouds. We then propose PointAD+, which broadens the interpretation of 3D anomalies by introducing an explicit 3D representation that emphasizes spatial abnormality and uncovers abnormal spatial relationships. Specifically, we propose G-aggregation, which incorporates geometry information so that the aggregated point representations are spatially aware. To capture rendering and spatial abnormality simultaneously, PointAD+ adopts hierarchical representation learning, incorporating implicit and explicit anomaly semantics into hierarchical text prompts: rendering prompts for the rendering layer and geometry prompts for the geometry layer. A cross-hierarchy contrastive alignment is further introduced to promote interaction between the rendering and geometry layers, facilitating mutual anomaly learning. Finally, PointAD+ integrates anomaly semantics from both layers to capture generalized anomaly semantics. At test time, PointAD+ can integrate RGB information in a plug-and-play manner to further improve detection performance. Extensive experiments demonstrate the superiority of PointAD+ in zero-shot (ZS) 3D anomaly detection across unseen objects with highly diverse class semantics, achieving a holistic understanding of abnormality.
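
To make the idea of an implicit point representation concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each point's feature is the plain average of the pixel features it projects to in the views where it is visible, and that a point's anomaly score is a two-way softmax over cosine similarities to a "normal" and an "abnormal" text embedding. The function names, the visibility mask, the mean pooling, and the temperature of 0.07 are all illustrative assumptions. The sketch uses plain Python to stay dependency-free.

```python
import math


def aggregate_point_features(pixel_feats, point_to_pixel, visible):
    """Average, per point, the features of the pixels it projects to.

    pixel_feats[v][p]    : feature vector (list of floats) of pixel p in view v
    point_to_pixel[v][n] : index of the pixel that point n projects to in view v
    visible[v][n]        : True if point n is visible in view v
    Mean pooling over views is an assumption made for this sketch.
    """
    num_views = len(pixel_feats)
    num_points = len(point_to_pixel[0])
    dim = len(pixel_feats[0][0])
    point_feats = []
    for n in range(num_points):
        total = [0.0] * dim
        count = 0
        for v in range(num_views):
            if visible[v][n]:
                feat = pixel_feats[v][point_to_pixel[v][n]]
                total = [t + f for t, f in zip(total, feat)]
                count += 1
        # A point that is never visible keeps an all-zero feature.
        point_feats.append([t / max(count, 1) for t in total])
    return point_feats


def cosine(a, b):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-8)


def anomaly_scores(point_feats, text_normal, text_abnormal, temperature=0.07):
    """Probability of 'abnormal' per point via a two-way softmax.

    text_normal and text_abnormal stand in for text-prompt embeddings;
    the temperature value is a hypothetical choice for this sketch.
    """
    scores = []
    for feat in point_feats:
        s_normal = cosine(feat, text_normal) / temperature
        s_abnormal = cosine(feat, text_abnormal) / temperature
        m = max(s_normal, s_abnormal)  # subtract the max for numerical stability
        e_normal = math.exp(s_normal - m)
        e_abnormal = math.exp(s_abnormal - m)
        scores.append(e_abnormal / (e_normal + e_abnormal))
    return scores
```

Under these assumptions, a point is scored purely from what its rendered pixels look like, which is exactly the limitation the abstract attributes to the implicit representation: nothing in the aggregation depends on the point's neighbors in 3D space.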
