GeoMask3D: Geometrically Informed Mask Selection for Self-Supervised Point Cloud Learning in 3D
Ali Bahri, Moslem Yazdanpanah, Mehrdad Noori, Milad Cheraghalikhani, Gustavo Adolfo Vargas Hakim, David Osowiechi, Farzad Beizaee, Ismail Ben Ayed, Christian Desrosiers
TL;DR
GeoMask3D (GM3D) addresses the inefficiency of random masking in self-supervised point-cloud learning by introducing geometry-guided patch masking powered by a teacher-student framework. It predicts patch geometric complexity (gc), employs a curriculum to progressively mask high-complexity regions, and integrates a knowledge-distillation pathway to align student features with a frozen teacher. The approach improves representations for Point-MAE and Point-M2AE, yielding stronger performance on ModelNet40, ScanObjectNN, and ShapeNetPart, while also accelerating pretraining convergence. By relying solely on 3D coordinates and geometric cues, GM3D advances 3D self-supervised learning without auxiliary modalities.
Abstract
We introduce GeoMask3D (GM3D), a geometrically informed mask selection strategy for self-supervised learning on point clouds that improves the efficiency of Masked Autoencoders (MAEs). Unlike conventional random masking, our technique uses a teacher-student model to direct the model's attention toward regions of higher geometric complexity. This strategy is grounded in the hypothesis that concentrating on harder patches yields a more robust feature representation, as evidenced by improved performance on downstream tasks. We also present a complete-to-partial feature-level knowledge distillation technique that guides the prediction of geometric complexity using the broader context available in feature-level information. Extensive experiments confirm our method's superiority over state-of-the-art (SOTA) baselines, demonstrating marked improvements in classification and few-shot tasks.
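To make the idea of curriculum-driven, complexity-guided masking concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that a per-patch geometric-complexity score is already available (here just an array), that the share of masked patches chosen by complexity grows linearly over training, and that the remaining masked patches are drawn at random. The function names `curriculum_fraction` and `select_mask`, the linear schedule, and the top-k selection rule are all hypothetical choices made for illustration.

```python
import numpy as np


def curriculum_fraction(epoch: int, total_epochs: int) -> float:
    """Share of the mask budget assigned by complexity rather than at random.

    Assumption: a linear ramp from 0 (fully random) to 1 (fully guided).
    """
    return min(1.0, epoch / max(1, total_epochs - 1))


def select_mask(gc_scores: np.ndarray, mask_ratio: float,
                geo_fraction: float, rng: np.random.Generator) -> np.ndarray:
    """Return a boolean mask over patches (True = masked).

    gc_scores    : per-patch geometric-complexity scores (assumed given).
    mask_ratio   : fraction of all patches to mask.
    geo_fraction : fraction of the masked patches picked by highest score.
    """
    n = gc_scores.shape[0]
    n_mask = int(round(mask_ratio * n))
    n_geo = int(round(geo_fraction * n_mask))

    # Patches sorted from highest to lowest complexity.
    order = np.argsort(-gc_scores)
    geo_idx = order[:n_geo]

    # Fill the rest of the mask budget uniformly from the remaining patches.
    rand_idx = rng.choice(order[n_geo:], size=n_mask - n_geo, replace=False)

    mask = np.zeros(n, dtype=bool)
    mask[geo_idx] = True
    mask[rand_idx] = True
    return mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scores = rng.random(64)  # stand-in for predicted complexity scores
    for epoch in (0, 150, 299):
        frac = curriculum_fraction(epoch, total_epochs=300)
        mask = select_mask(scores, mask_ratio=0.6, geo_fraction=frac, rng=rng)
        print(epoch, round(frac, 2), int(mask.sum()))
```

Under these assumptions the number of masked patches stays fixed at every epoch; only the way they are chosen shifts from random toward the highest-scoring patches as training progresses.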
