Label-Efficient Point Cloud Segmentation with Active Learning

Johannes Meyer, Jasper Hoffmann, Felix Schulz, Dominik Merkle, Daniel Buescher, Alexander Reiterer, Joschka Boedecker, Wolfram Burgard

TL;DR

This work tackles the high labeling cost in 3D point-cloud semantic segmentation by introducing a lightweight active learning pipeline that partitions scenes into easily annotatable 2D grid columns and selects regions using ensemble uncertainty (Entropy and VaR). It formally measures annotation effort with an area-based metric, and demonstrates competitive or superior performance to state-of-the-art region-based AL methods across S3DIS, Toronto-3D, and Freiburg datasets. The approach reduces preprocessing complexity, maintains strong performance, and reveals that annotated area can be a more meaningful efficiency proxy than the number of annotated points. These findings suggest practical gains for deploying AL in large-scale urban point clouds and highlight avenues for integrating richer augmentations and scalable region resolutions.

Abstract

Semantic segmentation of 3D point cloud data often comes with high annotation costs. Active learning automates the process of selecting which data to annotate, reducing the total amount of annotation needed to achieve satisfactory performance. Recent approaches to active learning for 3D point clouds are often based on sophisticated heuristics both for splitting point clouds into annotatable regions and for selecting the regions most beneficial for further neural network training. In this work, we propose a novel and easy-to-implement strategy to separate the point cloud into annotatable regions. In our approach, we utilize a 2D grid to subdivide the point cloud into columns. To identify the next data to be annotated, we employ a network ensemble to estimate the uncertainty in the network output. We evaluate our method on the S3DIS dataset, the Toronto-3D dataset, and a large-scale urban 3D point cloud of the city of Freiburg, parts of which we labeled manually. The extensive evaluation shows that our method yields performance on par with, or even better than, complex state-of-the-art methods on all datasets. Furthermore, we provide results suggesting that, in the context of point clouds, the annotated area can be a more meaningful measure for active learning algorithms than the number of annotated points.
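The two steps named in the abstract, subdividing the cloud into columns with a 2D grid and ranking regions by ensemble uncertainty, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the `cell_size` parameter, the use of the entropy of the ensemble-mean prediction, and the choice to score a column by the mean entropy of its points are all assumptions made for illustration.

```python
import numpy as np

def assign_columns(points, cell_size):
    """Map each point (x, y, z) to a column id using only x and y.

    Assumption: columns are axis-aligned square cells of side `cell_size`.
    """
    xy = points[:, :2]
    cells = np.floor((xy - xy.min(axis=0)) / cell_size).astype(np.int64)
    # Each distinct (ix, iy) cell becomes one consecutive column id.
    _, column_ids = np.unique(cells, axis=0, return_inverse=True)
    return column_ids.reshape(-1)

def ensemble_entropy(probs):
    """Per-point entropy of the ensemble-mean class distribution.

    probs has shape (members, points, classes) and holds softmax outputs.
    """
    mean = probs.mean(axis=0)
    return -(mean * np.log(mean + 1e-12)).sum(axis=1)

def select_columns(column_ids, point_scores, k):
    """Return the ids of the k columns with the highest mean point score.

    Assumption: a column's score is the mean of its points' scores.
    """
    n_cols = column_ids.max() + 1
    sums = np.bincount(column_ids, weights=point_scores, minlength=n_cols)
    counts = np.bincount(column_ids, minlength=n_cols)
    col_scores = sums / counts  # every column id has at least one point
    return np.argsort(col_scores)[::-1][:k]

# Toy example: 4 points, 2 ensemble members, 2 classes.
points = np.array([[0.1, 0.1, 0.0],
                   [0.2, 0.3, 5.0],   # same column as the first point
                   [1.5, 0.2, 1.0],
                   [1.6, 1.7, 2.0]])
probs = np.array([
    [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],  # members disagree on the last point
])
ids = assign_columns(points, cell_size=1.0)
scores = ensemble_entropy(probs)
chosen = select_columns(ids, scores, k=1)
```

In this toy run the first two points share a column despite very different heights, because the grid ignores z, and the column containing the point on which the ensemble members disagree is selected for annotation.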

Paper Structure

This paper contains 26 sections, 5 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: The goal of this work is to reduce the annotation cost of semantic segmentation for unlabeled urban point clouds. By simplifying existing methods, we aim to lower the entry barrier to applying active learning to point clouds.
  • Figure 2: Our proposed active learning pipeline. The initial dataset consists of unlabeled and labeled parts. The AL algorithm first separates the point cloud into columns and then selects regions with the highest ensemble entropy. These are presented to a human expert for extending the labeled dataset. We iteratively repeat the procedure.
  • Figure 3: VaR (left) and ensemble entropy (right) for the Freiburg dataset. Green corresponds to low and white to high uncertainty. Both uncertainty metrics indicate high uncertainty in areas of low vegetation. In contrast, the entropy indicates a higher uncertainty than the VaR in the upper parts of the car.
  • Figure 4: Region separation on the Toronto data with VCCS (left) and HDBScan (right). Each set of points (supervoxel) is drawn in a different color. The very noisy clusters on the left show the failure of VCCS on this dataset. In contrast, on the right, the combination of HDBScan with ground-plane removal gives very concise clusters.
  • Figure 5: Performance as a function of the annotated area for all datasets. The blue lines correspond to our proposed column separation, the orange lines correspond to the region separation with VCCS, and the green lines correspond to the HDBScan-based separation.
  • ...and 3 more figures
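
Figure 5 reports performance against annotated area rather than the number of annotated points. The summary does not spell out how the area metric is computed, so the following is only a rough sketch for the column-based separation, under the assumption that each annotated column contributes its grid cell's footprint (`cell_size` squared) to the annotated area. The function name `annotation_budget` and its return keys are hypothetical.

```python
import numpy as np

def annotation_budget(column_ids, labeled_columns, cell_size):
    """Compare area-based and point-based measures of annotation effort.

    Assumption: the annotated area is the number of labeled columns times
    the footprint of one grid cell, in squared units of `cell_size`.
    """
    column_ids = np.asarray(column_ids)
    labeled = np.isin(column_ids, list(labeled_columns))
    n_occupied = np.unique(column_ids).size
    n_labeled = np.unique(column_ids[labeled]).size
    return {
        "area": n_labeled * cell_size ** 2,
        "area_fraction": n_labeled / n_occupied,
        "point_fraction": float(labeled.mean()),
    }

# Toy example: column 0 is dense (8 points), columns 1 and 2 hold one point each.
column_ids = [0] * 8 + [1] + [2]
budget = annotation_budget(column_ids, labeled_columns={0}, cell_size=2.0)
```

Here labeling the single dense column covers 80% of the points but only one third of the occupied area, which illustrates how the two measures can diverge when point density varies across a scene.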