Hypergraph Convolutional Network based Weakly Supervised Point Cloud Semantic Segmentation with Scene-Level Annotations

Zhuheng Lu, Peng Zhang, Yuewei Dai, Weiqing Li, Zhiyong Su

TL;DR

This work tackles weakly supervised 3D point cloud semantic segmentation using scene-level annotations. It introduces WHCN, a weighted hypergraph convolutional network built on a hypergraph whose vertices are geometrically homogeneous superpoints and whose hyperedges connect class-specific seeds derived from class activation maps (CAM). Label propagation over this hypergraph produces high-quality point-level pseudo labels. The approach combines superpoint generation, CAM-based seed labeling, and spectral hypergraph convolution with a hyperedge attention module, and it achieves state-of-the-art results on ScanNet and S3DIS under scene-level supervision. WHCN addresses the point imbalance among categories and the sparse supervision provided by CAM, which reduces annotation cost and may extend to annotation schemes beyond scene-level labels.

Abstract

Point cloud segmentation with scene-level annotations is a promising but challenging task. Currently, the most popular approach is to employ the class activation map (CAM) to locate discriminative regions and then generate point-level pseudo labels from scene-level annotations. However, these methods suffer from the point imbalance among categories, as well as the sparse and incomplete supervision from CAM. In this paper, we propose a novel weighted hypergraph convolutional network-based method, called WHCN, to confront the challenges of learning point-wise labels from scene-level annotations. First, to simultaneously overcome the point imbalance among different categories and reduce the model complexity, superpoints of a training point cloud are generated by exploiting a geometrically homogeneous partition. Then, a hypergraph is constructed based on the high-confidence superpoint-level seeds, which are converted from scene-level annotations. Next, the WHCN takes the hypergraph as input and learns to predict high-precision point-level pseudo labels by label propagation. Besides the backbone network consisting of spectral hypergraph convolution blocks, a hyperedge attention module is learned to adjust the weights of hyperedges in the WHCN. Finally, a segmentation network is trained on these pseudo point cloud labels. We conduct comprehensive experiments on the ScanNet and S3DIS segmentation datasets. Experimental results demonstrate that the proposed WHCN effectively predicts point labels from scene-level annotations and yields state-of-the-art results. The source code is available at http://zhiyongsu.github.io/Project/WHCN.html.

Paper Structure (22 sections, 14 equations, 6 figures, 6 tables)

Figures (6)

  • Figure 1: The difference between (a) a graph and (b) a hypergraph. In a graph, each edge, represented by a line, connects exactly two vertices. In a hypergraph, each hyperedge, represented by a colored ellipse, can connect more than two vertices.
  • Figure 2: Overview of our proposed WHCN framework. First, superpoints are generated from the input point clouds. Meanwhile, scene-level annotations are transferred to superpoint-level labels. The hypergraph is constructed based on the generated superpoints. Then, pseudo labels are learned from the proposed weighted hypergraph convolution network.
  • Figure 3: Overview of the superpoint seed label generation. We input the generated superpoints and the corresponding scene labels. Then, we train a classification network to generate the class activation maps. Meanwhile, we select confident superpoint-level seed labels as the vertex labels. The labels of white vertices are unknown.
  • Figure 4: Illustration of the hypergraph convolution layer. The initial vertex feature is transformed by a learnable matrix $\mathbf{\Theta}$. Then, the transformed vertex features are gathered along the hyperedges by $\mathbf{H}^{\intercal}$ to obtain the hyperedge features. Finally, the hyperedge features are aggregated back to the vertices by the matrix $\mathbf{H}$ to obtain the final vertex features.
  • Figure 5: Qualitative results on the ScanNet dataset of our WHCN. The top row is the original input point clouds. The middle row is the ground truth. The bottom row shows the segmentation results with WHCN. Note that the black points in the ground truth indicate unclassified points, which are ignored during evaluation.
  • ...and 1 more figures
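
The three steps in the Figure 4 caption (transform by $\mathbf{\Theta}$, gather with $\mathbf{H}^{\intercal}$, aggregate with $\mathbf{H}$) can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. The degree normalization used here is the commonly used symmetric form and is an assumption, since the captions do not state it. The names `hypergraph_conv`, `matmul`, and `transpose`, and the toy hypergraph, are hypothetical. The hyperedge weights are passed in as fixed numbers, whereas in WHCN they are adjusted by a learned hyperedge attention module.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def hypergraph_conv(x, h, theta, edge_weights):
    """One hypergraph convolution step (normalization is an assumption).

    x            : |V| x C_in vertex features
    h            : |V| x |E| incidence matrix, h[v][e] = 1 if v is in e
    theta        : C_in x C_out transform matrix (learnable in a real model)
    edge_weights : |E| positive hyperedge weights
    Assumes every vertex lies in at least one hyperedge and every
    hyperedge contains at least one vertex, so no degree is zero.
    """
    # Vertex degree: weighted number of hyperedges containing the vertex.
    d_v = [sum(w * hve for w, hve in zip(edge_weights, row)) for row in h]
    # Hyperedge degree: number of vertices in the hyperedge.
    d_e = [sum(col) for col in zip(*h)]

    # Step 1: transform vertex features with theta, then scale by d_v^(-1/2).
    xt = matmul(x, theta)
    xt = [[val / dv ** 0.5 for val in row] for row, dv in zip(xt, d_v)]

    # Step 2: gather vertex features onto hyperedges with H^T,
    # then apply the hyperedge weight and divide by the hyperedge degree.
    edge_feat = matmul(transpose(h), xt)
    edge_feat = [[w * val / de for val in row]
                 for row, w, de in zip(edge_feat, edge_weights, d_e)]

    # Step 3: aggregate hyperedge features back to vertices with H,
    # then scale by d_v^(-1/2) again.
    out = matmul(h, edge_feat)
    return [[val / dv ** 0.5 for val in row] for row, dv in zip(out, d_v)]


# Toy hypergraph: 4 vertices, hyperedges e0 = {0, 1, 2} and e1 = {2, 3}.
H = [[1, 0],
     [1, 0],
     [1, 1],
     [0, 1]]
X = [[1.0], [1.0], [1.0], [1.0]]   # one feature per vertex
THETA = [[1.0]]                    # identity transform for illustration
out = hypergraph_conv(X, H, THETA, edge_weights=[1.0, 1.0])
print(out)
```

In this toy case, vertices 0 and 1 belong only to hyperedge e0 and start with identical features, so they receive identical outputs. Vertex 2 belongs to both hyperedges and mixes information from each, which is how labels can propagate between groups of superpoints that share a vertex.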