Proto-FG3D: Prototype-based Interpretable Fine-Grained 3D Shape Classification

Shuxian Ma; Zihao Dong; Runmin Cong; Sam Kwong; Xiuli Shao

TL;DR

Proto-FG3D tackles fine-grained 3D shape classification by shifting from parametric softmax to non-parametric prototypes. It projects 3D shapes into multi-view 2D images, encodes them with a shared backbone, and learns a class-specific prototype pool through Prototype Association and online clustering, updated via EMA. Training optimizes intra-class prototype alignment and inter-prototype separation using a combination of cross-entropy and view-prototype contrastive losses, while inference relies on nearest prototype matching for transparent decisions. Experiments on FG3D and ModelNet40 demonstrate state-of-the-art accuracy, improved robustness to class imbalance, and built-in interpretability via global prototypes and local view-level explanations.
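The nearest-prototype inference and EMA update described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the array shapes, the cosine-similarity scoring, the max-then-mean aggregation over views, and the momentum value are all assumptions made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products act as cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def predict_class(view_feats, prototypes):
    """Nearest-prototype prediction for one shape (assumed scoring rule).

    view_feats: (V, D) features of the V rendered views.
    prototypes: (C, K, D) pool holding K prototypes for each of C classes.
    Returns the predicted class and, per view, the index of the closest
    prototype within that class (usable as a view-level explanation).
    """
    v = l2_normalize(view_feats)
    p = l2_normalize(prototypes)
    sim = np.einsum("vd,ckd->vck", v, p)          # (V, C, K) cosine similarities
    # Assumption: a class is scored by its best prototype per view, averaged over views.
    class_scores = sim.max(axis=2).mean(axis=0)   # (C,)
    pred = int(class_scores.argmax())
    matched = sim[:, pred, :].argmax(axis=1)      # (V,) closest prototype per view
    return pred, matched

def ema_update(prototype, assigned_feats, momentum=0.99):
    """Move one prototype toward the mean of the features assigned to it."""
    if len(assigned_feats) == 0:
        return prototype                          # nothing assigned: keep as is
    target = assigned_feats.mean(axis=0)
    return l2_normalize(momentum * prototype + (1.0 - momentum) * target)
```

Because the prediction is an argmax over similarities to stored prototypes, the `matched` indices show which prototype each view was compared against, which is the sense in which such a classifier is transparent.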

Abstract

Deep learning-based multi-view coarse-grained 3D shape classification has achieved remarkable success over the past decade, leveraging the powerful feature learning capabilities of CNN-based and ViT-based backbones. However, fine-grained 3D classification, a challenging research area critical for detailed shape understanding, remains understudied. The main obstacles are the limited discriminative information captured during multi-view feature aggregation, particularly for subtle inter-class variations; class imbalance; and the inherent interpretability limitations of parametric models. To address these problems, we propose Proto-FG3D, the first prototype-based framework for fine-grained 3D shape classification, achieving a paradigm shift from parametric softmax to non-parametric prototype learning. First, Proto-FG3D establishes joint multi-view and multi-category representation learning via Prototype Association. Second, prototypes are refined via Online Clustering, improving both the robustness of multi-view feature allocation and inter-subclass balance. Finally, prototype-guided supervised learning enhances fine-grained discrimination via prototype-view correlation analysis and enables ad-hoc interpretability through transparent case-based reasoning. Experiments on FG3D and ModelNet40 show that Proto-FG3D surpasses state-of-the-art methods in accuracy, prediction transparency, and ad-hoc interpretability with visualizations, challenging conventional fine-grained 3D recognition approaches.
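To make the idea of balanced online clustering concrete, the sketch below assigns view features to the prototypes of one class so that no single prototype absorbs every sample. The abstract does not say which clustering algorithm Proto-FG3D uses; Sinkhorn-Knopp normalization is used here purely as a stand-in, since it is a common choice for equal-mass assignment. The function name `sinkhorn_assign` and the `eps` and `n_iters` values are hypothetical.

```python
import numpy as np

def sinkhorn_assign(sim, eps=0.05, n_iters=3):
    """Balanced soft assignment of N features to K prototypes (stand-in technique).

    sim: (N, K) similarities between view features and one class's prototypes.
    Returns an (N, K) matrix whose rows sum to 1 and whose columns are pushed
    toward equal total mass, discouraging collapse onto one prototype.
    """
    # Subtract the max before exponentiating for numerical stability.
    q = np.exp((sim - sim.max()) / eps).T     # (K, N)
    q /= q.sum()
    n_protos, n_samples = q.shape
    for _ in range(n_iters):
        # Row step: every prototype receives the same total mass 1/K.
        q /= q.sum(axis=1, keepdims=True)
        q /= n_protos
        # Column step: every sample distributes the same total mass 1/N.
        q /= q.sum(axis=0, keepdims=True)
        q /= n_samples
    return (q * n_samples).T                  # (N, K), each row sums to 1
```

With a plain argmax over `sim`, samples that all slightly prefer the same prototype would all be assigned to it; the alternating normalization spreads them across the pool instead. The resulting assignments could then select which features feed each prototype's update.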
