S2AM3D: Scale-controllable Part Segmentation of 3D Point Cloud
Han Su, Tianyu Huang, Zichen Wan, Xiaohe Wu, Wangmeng Zuo
TL;DR
3D part segmentation struggles with data scarcity and cross-view inconsistencies when using 2D priors. S2AM3D combines a point-consistent encoder trained with 3D contrastive supervision with a scale-aware prompt decoder that uses FiLM and bi-directional cross-attention to produce scale-controllable, per-point segmentations. It introduces a scalable data pipeline and a large, high-quality part-level dataset to supervise open-domain shapes. Experiments show state-of-the-art performance for interactive and full segmentation with strong robustness and real-time granularity control.
Abstract
Part-level point cloud segmentation has recently attracted significant attention in 3D computer vision. Nevertheless, existing research is constrained by two major challenges: native 3D models lack generalization due to data scarcity, while introducing 2D pre-trained knowledge often leads to inconsistent segmentation results across different views. To address these challenges, we propose S2AM3D, which incorporates 2D segmentation priors with 3D consistent supervision. We design a point-consistent part encoder that aggregates multi-view 2D features through native 3D contrastive learning, producing globally consistent point features. A scale-aware prompt decoder is then proposed to enable real-time adjustment of segmentation granularity via continuous scale signals. Simultaneously, we introduce a large-scale, high-quality part-level point cloud dataset with more than 100k samples, providing ample supervision signals for model training. Extensive experiments demonstrate that S2AM3D achieves leading performance across multiple evaluation settings, exhibiting exceptional robustness and controllability when handling complex structures and parts with significant size variations.
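To make the idea of scale conditioning concrete, the following is a minimal sketch of FiLM (feature-wise linear modulation) driven by a continuous scale signal. It is not the paper's decoder: the function name `film_modulate`, the single linear layer mapping the scale to `gamma` and `beta`, and all shapes are assumptions chosen for illustration. It only shows how one scalar can rescale and shift every per-point feature channel.

```python
import numpy as np

def film_modulate(point_feats, scale, w_gamma, b_gamma, w_beta, b_beta):
    """Illustrative FiLM layer (hypothetical, not the authors' implementation).

    point_feats: (N, C) per-point features.
    scale:       scalar granularity signal.
    w_gamma, w_beta: (1, C) weights; b_gamma, b_beta: (C,) biases.
    Returns (N, C) features computed as gamma(scale) * feats + beta(scale).
    """
    s = np.atleast_1d(np.asarray(scale, dtype=np.float64))  # shape (1,)
    gamma = s @ w_gamma + b_gamma  # per-channel multiplier, shape (C,)
    beta = s @ w_beta + b_beta     # per-channel offset, shape (C,)
    return point_feats * gamma + beta  # broadcast over the N points

# Usage: the same features modulated at two different scale values.
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 4))
w_g, b_g = rng.normal(size=(1, 4)), np.ones(4)
w_b, b_b = rng.normal(size=(1, 4)), np.zeros(4)
coarse = film_modulate(feats, 0.1, w_g, b_g, w_b, b_b)
fine = film_modulate(feats, 0.9, w_g, b_g, w_b, b_b)
```

Because the scale enters only through `gamma` and `beta`, changing it does not require recomputing the point features, which is one plausible reason such conditioning suits real-time granularity adjustment.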
