Joint Learning for Scattered Point Cloud Understanding with Hierarchical Self-Distillation
Kaiyue Zhou, Ming Dong, Peiyuan Zhi, Shengjin Wang
TL;DR
The paper addresses the vulnerability of point-cloud understanding to incomplete scans by proposing an end-to-end cascaded framework that combines an upstream masked autoencoder (MAE) with a downstream hierarchy-based classifier. Central to the approach is hierarchical self-distillation (HSD), which reinforces multi-scale features by transferring information from the deepest layer to earlier branches while maximizing mutual information, explained through an information-bottleneck perspective. The authors formulate a joint learning objective that simultaneously reconstructs incomplete data and performs classification or segmentation, with a plug-and-play downstream backbone. Empirical results on ModelNet40, ScanObjectNN, and ShapeNetPart show state-of-the-art performance for scattered point clouds, improved robustness to sparsity, and strong regularization effects from HSD. This work advances practical 3D understanding under realistic, imperfect sensing conditions and provides a flexible framework for integrating reconstruction and recognition tasks.
Abstract
Numerous point-cloud understanding techniques focus on whole entities and achieve satisfactory results, but with only limited tolerance to sparsity. Moreover, these methods are generally sensitive to incomplete point clouds that are scanned with flaws or large gaps. In this paper, we propose an end-to-end architecture that compensates for and identifies partial point clouds on the fly. First, we propose a cascaded solution that trains the upstream masked autoencoder (MAE) and the downstream understanding network jointly, allowing the task-oriented downstream network to identify the points generated by the completion-oriented upstream network. The two streams complement each other, improving performance on both completion and downstream tasks. Second, to explicitly capture the pattern of the predicted points, we introduce hierarchical self-distillation (HSD), which can be applied to any hierarchy-based point cloud method. Rather than simply aggregating the multi-scale features, HSD lets the deepest classifier, which has a larger perceptual field of local kernels and a longer code length, provide additional regularization to the intermediate classifiers, thereby maximizing the mutual information (MI) between the teacher and the students. The proposed HSD strategy is particularly well suited to tasks involving scattered point clouds, where a single prediction may be imprecise because the geometric shape being reconstructed is inherently irregular and sparse. We show the advantage of the self-distillation process in the hyperspaces based on the information bottleneck principle. Our method achieves state-of-the-art performance on both classification and part segmentation tasks.
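To make the two ideas above concrete, the following is a minimal sketch of how a joint objective with hierarchical self-distillation could be written. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the loss formulas, so the Chamfer distance for reconstruction, the temperature-softened KL term for distillation, and every name and weight here (`hsd_loss`, `joint_loss`, `alpha`, `temperature`, `lam`) are hypothetical choices.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample against an integer class label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def hsd_loss(branch_logits, label, alpha=0.5, temperature=2.0):
    """Assumed HSD form: every branch is supervised by the label, and each
    intermediate branch (student) is also pulled toward the deepest branch
    (teacher). branch_logits is ordered shallow -> deep."""
    # In an autograd framework the teacher output would usually be detached.
    teacher = softmax(branch_logits[-1], temperature)
    loss = cross_entropy(branch_logits[-1], label)
    for student_logits in branch_logits[:-1]:
        student = softmax(student_logits, temperature)
        loss += cross_entropy(student_logits, label)
        loss += alpha * kl_divergence(teacher, student)
    return loss


def chamfer_distance(pred, target):
    """Symmetric Chamfer distance (squared Euclidean) between two point sets.
    Assumed here as the reconstruction loss of the upstream MAE."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    pred_to_target = sum(min(sq_dist(p, t) for t in target) for p in pred)
    target_to_pred = sum(min(sq_dist(t, p) for p in pred) for t in target)
    return pred_to_target / len(pred) + target_to_pred / len(target)


def joint_loss(pred_points, target_points, branch_logits, label, lam=1.0):
    """Upstream reconstruction plus weighted downstream HSD loss."""
    reconstruction = chamfer_distance(pred_points, target_points)
    return reconstruction + lam * hsd_loss(branch_logits, label)
```

In this sketch, setting `alpha=0` reduces the downstream term to plain multi-branch supervision, which makes it easy to compare against the distilled variant. A real training loop would compute the same quantities on batched tensors so that gradients reach both the completion and the understanding networks.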
