Taylor Series-Inspired Local Structure Fitting Network for Few-shot Point Cloud Semantic Segmentation

Changshuo Wang, Shuting He, Xiang Fang, Meiqing Wu, Siew-Kei Lam, Prayag Tiwari

TL;DR

TaylorSeg is a pretraining-free approach to few-shot point cloud semantic segmentation. It models local geometry as a polynomial fitting problem via TaylorConv, which combines low-order and high-order information through LoConv and HiConv. The method comes in two variants: TaylorSeg-NN, a parameter-free baseline using trigonometric encodings, and TaylorSeg-PN, a learnable model augmented with an Adaptive Push-Pull (APP) module that aligns the query and support feature distributions. On S3DIS and ScanNet, TaylorSeg-PN improves on the previous state of the art in the 2-way 1-shot setting by +2.28% and +4.37% mIoU, respectively. By pairing an explicit local-structure fit with adaptive feature alignment, the work reduces the reliance of label-scarce 3D segmentation on heavy pretraining.

Abstract

Few-shot point cloud semantic segmentation aims to accurately segment "unseen" new categories in point cloud scenes using limited labeled data. However, pretraining-based methods not only introduce excessive time overhead but also overlook the local structure representation among irregular point clouds. To address these issues, we propose a pretraining-free local structure fitting network for few-shot point cloud semantic segmentation, named TaylorSeg. Specifically, inspired by Taylor series, we treat the local structure representation of irregular point clouds as a polynomial fitting problem and propose a novel local structure fitting convolution, called TaylorConv. This convolution learns the low-order basic information and high-order refined information of point clouds from explicit encoding of local geometric structures. Then, using TaylorConv as the basic component, we construct two variants of TaylorSeg: a non-parametric TaylorSeg-NN and a parametric TaylorSeg-PN. The former can achieve performance comparable to existing parametric models without pretraining. For the latter, we equip it with an Adaptive Push-Pull (APP) module to mitigate the feature distribution differences between the query set and the support set. Extensive experiments validate the effectiveness of the proposed method. Notably, under the 2-way 1-shot setting, TaylorSeg-PN achieves improvements of +2.28% and +4.37% mIoU on the S3DIS and ScanNet datasets respectively, compared to the previous state-of-the-art methods. Our code is available at https://github.com/changshuowang/TaylorSeg.
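
For orientation, the Taylor series that motivates the polynomial-fitting view expands a smooth function f around a point a as shown below. Read loosely against the abstract, the leading term plays the role of the "low-order basic information" and the remaining terms the "high-order refined information". This mapping is an interpretive assumption; the actual definitions of TaylorConv, LoConv, and HiConv are given in the paper body and are not reproduced here.

$$
f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}\,(x-a)^{n}
     = f(a) + f'(a)\,(x-a) + \frac{f''(a)}{2!}\,(x-a)^{2} + \cdots
$$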

Paper Structure

This paper contains 21 sections, 16 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: Top: Most existing methods are based on fine-tuning a pre-trained DGCNN, followed by using query features to guide and align the prototype features. This strategy is not only time-consuming but also overlooks the importance of local structure representation. Bottom: We propose a new backbone for point cloud tasks that requires no pre-training and possesses strong local structure representation capabilities. Additionally, we design an APP module that effectively aligns query features with prototype features.
  • Figure 2: (a) The overall architecture of TaylorSeg. It centers around TaylorConv, a local feature extraction module inspired by the Taylor series. TaylorConv forms the Taylor Block when combined with the FPS operation, and stacking these blocks along with upsampling operations constitutes our backbone network. TaylorSeg has two variants: for TaylorSeg-NN, the APP module is replaced with masked average pooling, allowing direct testing without training. For TaylorSeg-PN, both the TaylorConv and APP modules are optimized during training before testing. (b) The data flow diagram of the APP module. It is primarily designed for TaylorSeg-PN. It is not only parameter-efficient but also significantly reduces the feature distribution discrepancy between the query and support sets. Additionally, it is plug-and-play, making it compatible with many few-shot methods.
  • Figure 3: Ablation on the number of encoder layers in the 2-way 1-shot setting on the S3DIS dataset.
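
The Figure 2 caption notes that TaylorSeg-NN replaces the APP module with masked average pooling. A minimal NumPy sketch of that pooling step follows. The function names, the array shapes, and the cosine-similarity labeling step are illustrative assumptions based on common few-shot practice; they are not taken from the paper or its code release.

```python
import numpy as np


def masked_average_pooling(features: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Average the rows of `features` (N, C) selected by the boolean `mask` (N,).

    Returns a (C,) prototype vector. An empty mask yields a zero vector
    (an assumption made here to avoid division by zero).
    """
    weights = mask.astype(features.dtype)
    count = weights.sum()
    if count == 0:
        return np.zeros(features.shape[1], dtype=features.dtype)
    return (features * weights[:, None]).sum(axis=0) / count


def label_by_nearest_prototype(query: np.ndarray, prototypes: np.ndarray,
                               eps: float = 1e-8) -> np.ndarray:
    """Assign each query point (M, C) to the prototype (K, C) with the
    highest cosine similarity. Hypothetical helper, shown for illustration."""
    q = query / (np.linalg.norm(query, axis=1, keepdims=True) + eps)
    p = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + eps)
    return (q @ p.T).argmax(axis=1)


# Toy support set: four points with 2-D features; the first two are foreground.
support_feats = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]])
support_mask = np.array([True, True, False, False])

fg_proto = masked_average_pooling(support_feats, support_mask)   # foreground prototype
bg_proto = masked_average_pooling(support_feats, ~support_mask)  # background prototype

query_feats = np.array([[5.0, 1.0], [0.5, 2.0]])
labels = label_by_nearest_prototype(query_feats, np.stack([fg_proto, bg_proto]))
print(fg_proto, bg_proto, labels)
```

Because the pooling has no learnable parameters, a pipeline built this way can be evaluated directly on new classes, which is consistent with the caption's remark that TaylorSeg-NN allows testing without training.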