White Aggregation and Restoration for Few-shot 3D Point Cloud Semantic Segmentation
Authors
Jiyun Im, SuBeen Lee, Miso Lee, Jae-Pil Heo
Abstract
Few-Shot 3D Point Cloud Semantic Segmentation (FS-PCS) aims to predict per-point labels for an unlabeled point cloud, given only a few labeled examples. To extract discriminative representations from the limited labeled set, existing methods have constructed prototypes using algorithms such as farthest point sampling (FPS). However, we point out that this convention has undesirable effects as performance fluctuates depending on sampling, while the prototype generation process remains underexplored in the field. This motivates us to investigate an advanced prototype generation method based on attention mechanism. Despite its potential, we found that vanilla attention module suffers from the distributional gap between prototypical tokens and support features. To overcome this, we propose White Aggregation and Restoration Module (WARM), which resolves the misalignment by sandwiching cross-attention between whitening and coloring transformations. Specifically, whitening aligns the features to tokens before the attention process, and coloring subsequently restores the original distribution to the attended tokens. This simple yet effective design enables robust attention, thereby generating prototypes that capture the semantic relationships in support features. WARM achieves state-of-the-art performance with a significant margin on FS-PCS benchmarks, and demonstrates its effectiveness through extensive experiments.