Dense Supervision Propagation for Weakly Supervised Semantic Segmentation on 3D Point Clouds

Jiacheng Wei, Guosheng Lin, Kim-Hui Yap, Fayao Liu, Tzu-Yi Hung

TL;DR

This work tackles 3D point cloud semantic segmentation under weak supervision by densely propagating sparse labels through cross-sample and intra-sample feature reallocating. It introduces a two-stage training framework: first, Cross-Sample Feature Reallocating (CSFR) transfers features and gradients across input pairs, then Intra-Sample Feature Reallocating (ISFR) propagates supervision within each sample, with corresponding cross- and self-regularization losses. By leveraging a KPConv-based backbone and decoupling the two modules during training, the method achieves performance close to fully supervised baselines on S3DIS and ScanNet using only 10% or 1% labeled points. The approach significantly reduces annotation costs while maintaining practical accuracy, offering a scalable path for dense 3D segmentation in real-world scenes.

Abstract

Semantic segmentation on 3D point clouds is an important task for 3D scene understanding. Although dense labeling of 3D data is expensive and time-consuming, only a few works address weakly supervised point cloud segmentation, which relieves the labeling cost by learning from simpler and cheaper labels. Moreover, there are still large performance gaps between existing weakly supervised methods and state-of-the-art fully supervised methods. In this paper, we train a semantic point cloud segmentation network with only a small portion of points labeled. We argue that the limited supervision can be better utilized by densely propagating the supervision signal from the labeled points to other points within and across the input samples. Specifically, we propose a cross-sample feature reallocating module that transfers similar features, and thereby re-routes the gradients, across two samples with common classes, and an intra-sample feature redistribution module; together they propagate supervision signals to unlabeled points across and within point cloud samples. We conduct extensive experiments on the public datasets S3DIS and ScanNet. With only 10% and 1% of labels, our weakly supervised method produces results comparable to those of its fully supervised counterpart.
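To make the weak-label setting concrete, the sketch below keeps labels for a fixed fraction of points (10% or 1%) and averages a cross-entropy loss over the labeled points only. It is a minimal illustration, not the authors' implementation: the function names `sample_label_mask` and `masked_cross_entropy` are hypothetical, and choosing the labeled points uniformly at random is an assumption, since the text here does not say how they are selected.

```python
import math
import random


def sample_label_mask(num_points, ratio, seed=0):
    """Mark roughly `ratio` of the points as labeled.

    Assumption: labeled points are drawn uniformly at random.
    """
    rng = random.Random(seed)
    num_labeled = max(1, round(num_points * ratio))
    labeled = set(rng.sample(range(num_points), num_labeled))
    return [i in labeled for i in range(num_points)]


def masked_cross_entropy(logits, labels, mask):
    """Mean cross-entropy over labeled points; unlabeled points are skipped."""
    total, count = 0.0, 0
    for point_logits, label, is_labeled in zip(logits, labels, mask):
        if not is_labeled:
            continue
        # Numerically stable log-sum-exp over the class logits.
        peak = max(point_logits)
        log_z = peak + math.log(sum(math.exp(v - peak) for v in point_logits))
        total += log_z - point_logits[label]
        count += 1
    return total / count if count else 0.0
```

With this loss alone, unlabeled points contribute no gradient of their own, which is the gap that the feature reallocating modules are meant to close.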

Paper Structure

This paper contains 19 sections, 12 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: (a) Supervision propagation across samples via cross-sample feature reallocating. For clarity, this figure shows only a single direction of the feature reallocating and supervision propagation; the module can propagate supervision bidirectionally. (b) An illustration of our weak labels. (c) Supervision propagation within samples via intra-sample feature reallocating.
  • Figure 2: The overall two-stage training framework for our proposed method. In Stage 1, a pair of point cloud samples are input into the encoder, yielding encoded features $F_i$ and $F_j$. A bespoke cross-sample feature reallocation module is employed to redistribute these features and direct gradient flows, thereby facilitating the propagation of supervisory signals between the pair. This results in the formation of transformed features $F_j^c$ and $F_i^c$. Subsequently, in Stage 2, individual point cloud samples are independently processed by the network to produce the encoded feature $F_m$. Here, an intra-sample feature propagation module is designed to internalize supervisory signals, obtaining the modified feature $F_m^s$. Throughout both stages, the encoded and warped features are conveyed to the decoder to generate the final predictions. It is important to note that the encoder and decoder are consistent across both stages, with the encoder and decoder parameters from Stage 1 being transferred to Stage 2.
  • Figure 3: The blue dotted line illustrates how the weak supervision from sample $i$ is propagated to sample $j$ based on point correlation. For clarity, this figure shows only a single direction of the feature reallocating and supervision propagation; the module can propagate supervision bidirectionally.
  • Figure 4: Visualization of the affinity calculated by (a) the CSFR module and (b) the ISFR module. The colors on the point cloud show the affinity of each point to the point indicated by the red dot. Pink means higher similarity and green means lower similarity.
  • Figure 5: Visualizations on S3DIS Area-5. The results are, left to right, input RGB point cloud, ground truth, fully supervised method, and our proposed method with 10% labels.
  • ...and 1 more figures
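
The captions for Figures 2 to 4 describe features being redistributed according to a point-to-point affinity. A rough sketch of that idea follows. It assumes the affinity is a softmax over cosine similarity with a temperature, which the captions do not specify, and the name `reallocate_features` and the `temperature` parameter are hypothetical. It is written in plain Python for readability; a real implementation would use a tensor library so that gradients can flow through the affinity.

```python
import math


def _l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are left as-is)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def _softmax(row):
    """Numerically stable softmax over one row of similarities."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def reallocate_features(query_feats, source_feats, temperature=0.1):
    """Rebuild each query feature as an affinity-weighted sum of source features.

    Assumption: affinity = softmax(cosine similarity / temperature).
    Returns (warped_features, affinity), where affinity[q][s] is the weight
    that source point s contributes to query point q.
    """
    queries = [_l2_normalize(f) for f in query_feats]
    sources = [_l2_normalize(f) for f in source_feats]
    dim = len(source_feats[0])
    warped, affinity = [], []
    for q in queries:
        sims = [sum(a * b for a, b in zip(q, s)) / temperature for s in sources]
        weights = _softmax(sims)
        affinity.append(weights)
        warped.append([
            sum(w * src[d] for w, src in zip(weights, source_feats))
            for d in range(dim)
        ])
    return warped, affinity
```

Under these assumptions, the cross-sample case would call `reallocate_features(F_j, F_i)` to express the points of sample $j$ in terms of features from sample $i$ (and the reverse for the other direction), while the intra-sample case would pass the same sample as both arguments, `reallocate_features(F_m, F_m)`.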