Rethinking Few-shot 3D Point Cloud Semantic Segmentation

Zhaochong An, Guolei Sun, Yun Liu, Fayao Liu, Zongwei Wu, Dan Wang, Luc Van Gool, Serge Belongie

TL;DR

This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS), focusing on two significant issues in the state of the art: foreground leakage and sparse point distribution. It proposes a novel FS-PCS model based on correlation optimization, referred to as Correlation Optimization Segmentation (COSeg).

Abstract

This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS), with a focus on two significant issues in the state of the art: foreground leakage and sparse point distribution. The former arises from non-uniform point sampling, which allows models to distinguish the density disparities between foreground and background for easier segmentation. The latter results from sampling only 2,048 points, which limits semantic information and deviates from real-world practice. To address these issues, we introduce a standardized FS-PCS setting, upon which a new benchmark is built. Moreover, we propose a novel FS-PCS model. While previous methods are based on feature optimization, mainly refining support features to enhance prototypes, our method is based on correlation optimization and is referred to as Correlation Optimization Segmentation (COSeg). Specifically, we compute Class-specific Multi-prototypical Correlation (CMC) for each query point, representing its correlations to category prototypes. Then, we propose the Hyper Correlation Augmentation (HCA) module to enhance CMC. Furthermore, because few-shot training inherently makes models susceptible to the base classes (base susceptibility), we propose to learn non-parametric prototypes for the base classes during training. The learned base prototypes are used to calibrate correlations for the background class through a Base Prototypes Calibration (BPC) module. Experiments on popular datasets demonstrate the superiority of COSeg over existing methods. The code is available at: https://github.com/ZhaochongAn/COSeg
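To make the idea of per-point correlations to category prototypes concrete, the following is a minimal sketch and not the authors' implementation. It assumes cosine similarity as the correlation measure and assumes that each class already has a small set of prototype vectors; the function names `cosine` and `multi_prototype_correlation` and the toy 2-D features are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def multi_prototype_correlation(query_feats, prototypes_by_class):
    """For each query point, map every class name to its similarities with that class's prototypes.

    Assumption: cosine similarity stands in for the correlation measure.
    """
    return [
        {cls: [cosine(q, p) for p in protos] for cls, protos in prototypes_by_class.items()}
        for q in query_feats
    ]


# Toy 2-D features for two query points (hypothetical values).
query_feats = [[1.0, 0.0], [0.0, 2.0]]

# Two prototypes per class (hypothetical values).
prototypes_by_class = {
    "background": [[1.0, 0.0], [1.0, 1.0]],
    "foreground": [[0.0, 1.0], [-1.0, 0.0]],
}

corr = multi_prototype_correlation(query_feats, prototypes_by_class)
for i, per_class in enumerate(corr):
    print(i, per_class)
```

In this sketch the correlations, rather than the features themselves, are the quantity a downstream learnable module would consume, which is the distinction the abstract draws between correlation optimization and feature optimization.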

Paper Structure (16 sections, 13 equations, 8 figures, 6 tables, 1 algorithm)

Figures (8)

  • Figure 1: Previous feature optimization vs. our correlation optimization. Top: Most prior work [zhao2021few, he2023prototype, ning2023boosting, zhu2023cross, mao2022bidirectional, wang2023few] on FS-PCS focuses on feature optimization by designing support adaption modules for enhanced prototypes and then making predictions through non-parametric label propagation (LBP) or cosine similarity (COS), implicitly modeling correlations. Bottom: Instead of optimizing features, we propose to directly use correlations as input to learnable modules, explicitly refining correlations.
  • Figure 2: Visualization of two scenes from the S3DIS dataset [armeni20163d], with door and board as the foreground class for 1-way segmentation, respectively. Each scene includes six types of point clouds, arranged from left to right: (1) The original point cloud; (2) Ground truth of all categories; (3) Our corrected input with 20,480 points in a uniform distribution; (4) Input with 20,480 points in a biased distribution; (5) Input with 2,048 points in a uniform distribution; (6) Input with 2,048 points in a biased distribution, as adopted by previous works.
  • Figure 3: Overall architecture of the proposed COSeg. Initially, we compute CMC for each query point using the backbone features. These correlations are then forwarded to the subsequent HCA module, which actively mines hyper-relations among correlations across points and classes. Additionally, we dynamically learn non-parametric base prototypes on the fly and introduce the BPC module to effectively alleviate the base susceptibility problem. For clarity, we present the model under the 1-way 1-shot setting.
  • Figure 4: Qualitative comparisons between our proposed model COSeg and QGE [ning2023boosting]. Each row, from top to bottom, represents the $1$-way $1$-shot task with the target category as floor (blue), chair (red), and table (purple), respectively.
  • Figure 5: Visual comparisons between our models with BPC (w/ BPC) and without BPC (w/o BPC). Each row corresponds to the $1$-way $1$-shot task targeting bookcase (green) and chair (red), respectively, arranged from top to bottom.
  • ...and 3 more figures
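
The contrast between uniform and biased inputs described in the Figure 2 caption can be illustrated with a small sketch. It is not the sampling code of this paper or of prior work: `uniform_sample` draws points uniformly without replacement, while `biased_sample` is a hypothetical sampler that forces a fixed share of the sample to be foreground, chosen only to show how a density disparity between foreground and background can appear. The toy scene, the 5% foreground ratio, and all function names are assumptions.

```python
import random


def uniform_sample(points, n, seed=0):
    """Draw n points uniformly at random without replacement."""
    rng = random.Random(seed)
    return rng.sample(points, n)


def biased_sample(points, n, fg_share=0.5, seed=0):
    """Hypothetical biased sampler: force roughly fg_share of the sample to be foreground."""
    rng = random.Random(seed)
    fg = [p for p in points if p[1] == 1]
    bg = [p for p in points if p[1] == 0]
    n_fg = min(len(fg), int(n * fg_share))
    return rng.sample(fg, n_fg) + rng.sample(bg, n - n_fg)


def foreground_fraction(points):
    """Fraction of points labelled 1 (foreground)."""
    return sum(label for _, label in points) / len(points)


# Toy scene: 100,000 labelled points, 5% of them foreground (assumed ratio).
scene = [((float(i), 0.0, 0.0), 1 if i % 20 == 0 else 0) for i in range(100_000)]

uniform_20480 = uniform_sample(scene, 20_480)
biased_2048 = biased_sample(scene, 2_048)

print("scene   :", foreground_fraction(scene))
print("uniform :", foreground_fraction(uniform_20480))
print("biased  :", foreground_fraction(biased_2048))
```

Under these assumptions the uniform sample keeps the foreground share close to that of the full scene, whereas the biased sample makes the foreground far denser than the background, which is the kind of cue the abstract calls foreground leakage.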