No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation

Xiangyang Zhu, Renrui Zhang, Bowei He, Ziyu Guo, Jiaming Liu, Han Xiao, Chaoyou Fu, Hao Dong, Peng Gao

TL;DR

This work tackles data-hungry 3D scene segmentation by proposing Seg-NN, a training-free non-parametric encoder that uses trigonometric positional encodings and hand-crafted low-frequency filters to generate per-point embeddings for few-shot segmentation. Building on Seg-NN, Seg-PN adds the QUEST module, which refines class prototypes through query-support interaction and uses cross- and self-correlation to mitigate prototype bias, with no pre-training stage. On S3DIS and ScanNet, Seg-PN improves on the previous state of the art by +4.19% and +7.71% mIoU, respectively, while reducing training time by over 90% and using as few as 0.24M parameters. The approach reduces the data and compute burdens of 3D few-shot segmentation and shows promising cross-dataset transferability, which suggests that non-parametric encoders may apply to broader 3D tasks.

Abstract

To reduce the reliance on large-scale datasets, recent works in 3D segmentation resort to few-shot learning. Current 3D few-shot segmentation methods first pre-train models on 'seen' classes, and then evaluate their generalization performance on 'unseen' classes. However, the prior pre-training stage not only introduces excessive time overhead but also incurs a significant domain gap on 'unseen' classes. To tackle these issues, we propose a Non-parametric Network for few-shot 3D Segmentation, Seg-NN, and its Parametric variant, Seg-PN. Without training, Seg-NN extracts dense representations with hand-crafted filters and achieves performance comparable to existing parametric models. Because it eliminates pre-training, Seg-NN alleviates the domain gap issue and saves a substantial amount of time. Based on Seg-NN, Seg-PN only requires training a lightweight QUEry-Support Transferring (QUEST) module, which enhances the interaction between the support set and the query set. Experiments suggest that Seg-PN outperforms the previous state-of-the-art method by +4.19% and +7.71% mIoU on the S3DIS and ScanNet datasets, respectively, while reducing training time by 90%, indicating its effectiveness and efficiency.
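
To make the idea of a training-free, hand-crafted encoding more concrete, the following is a minimal sketch of a trigonometric positional encoding for a single 3D point. It is an illustration under stated assumptions, not the paper's implementation: the function name `trig_encode`, the log-linear frequency schedule, and the parameters `alpha` and `beta` are all hypothetical choices made here. Only the use of sine and cosine functions of the coordinates, and the example dimension `d = 20`, come from the text.

```python
import math


def trig_encode(point, d=20, alpha=100.0, beta=1.0):
    """Encode one (x, y, z) point with sines and cosines at d frequencies per axis.

    Assumption: frequencies follow a log-linear schedule beta * alpha**(k / d).
    The schedule actually used by Seg-NN is not given in this section.
    Returns a flat list of length 3 * 2 * d with no learnable parameters.
    """
    freqs = [beta * alpha ** (k / d) for k in range(d)]
    encoding = []
    for coord in point:
        for u in freqs:
            encoding.append(math.sin(u * coord))
            encoding.append(math.cos(u * coord))
    return encoding


embedding = trig_encode((0.1, 0.2, 0.3))
```

Because the mapping is fixed, the same function can be applied to support and query points alike without any training step, which is the property the abstract attributes to Seg-NN.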

Paper Structure

This paper contains 48 sections, 11 equations, 10 figures, and 17 tables.

Figures (10)

  • Figure 1: Comparison of Existing Methods and Our Approaches. Our non-parametric Seg-NN contains no learnable parameters and thus discards both pre-training and episodic training stages with superior efficiency, and the parametric Seg-PN further improves the performance with a lightweight QUEST module.
  • Figure 2: Alleviating Domain Gap by Seg-NN (a) and Prototype Bias by Seg-PN (b) on S3DIS (Armeni et al., 2016). The horizontal axis presents the number of ways and shots in the form of (Way, Shot).
  • Figure 3: The Framework of the Non-parametric Seg-NN. The encoder extracts support- and query-set features and the segmentation head segments the query set based on similarity. To facilitate illustration, we assume the encoder consists of three manipulation layers.
  • Figure 4: Examples of Position Encodings and Frequencies, where we set $d=20$. (a) and (b) are examples of initial encodings. (c) is the average frequency spectrum over all points' encodings of a point cloud. (d) is the distribution of sampled frequencies.
  • Figure 5: Details of QUEST in Seg-PN. QUEST finally outputs adjusted prototypes $\mathbf{F}^{P}{}^{*}$.
  • ...and 5 more figures
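
The caption of Figure 3 states that the segmentation head labels the query set based on similarity to the support set. A minimal sketch of one common way to do this follows. It assumes that each class prototype is the mean of that class's support features and that similarity is cosine similarity; neither detail is specified in this section, and the helper names `class_prototypes`, `cosine`, and `segment_query` are hypothetical.

```python
import math


def class_prototypes(support_feats, support_labels):
    """Average the support features of each class (assumed prototype definition)."""
    sums, counts = {}, {}
    for feat, label in zip(support_feats, support_labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], feat)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def segment_query(query_feats, prototypes):
    """Assign each query point the class whose prototype is most similar."""
    return [
        max(prototypes, key=lambda c: cosine(q, prototypes[c]))
        for q in query_feats
    ]


# Toy 2-way episode with 2-D features standing in for encoder outputs.
support = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = ["chair", "chair", "table", "table"]
protos = class_prototypes(support, labels)
predicted = segment_query([[0.8, 0.2], [0.2, 0.7]], protos)
```

In this sketch nothing is learned: prototypes are recomputed from the support set for every episode, which is consistent with the non-parametric setting described for Seg-NN. A module such as QUEST would then adjust the prototypes before the similarity step.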