BaSAL: Size-Balanced Warm Start Active Learning for LiDAR Semantic Segmentation

Jiarong Wei, Yancong Lin, Holger Caesar

TL;DR

BaSAL introduces a size-balanced warm-start active learning framework for LiDAR semantic segmentation to tackle class imbalance and cold-start challenges. It uses size-based adaptive binning to create partitions and initial warm-start sampling, followed by information-measure-driven sampling that combines Softmax Entropy and CoreSet-inspired feature diversity. The approach yields large gains at low annotation budgets, achieving near full-supervision performance on SemanticKITTI with only $5\%$ labeled data, and competitive results on nuScenes. This method reduces labeling effort while improving performance on rare classes, with strong practical implications for scalable 3D perception in autonomous systems.
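As a rough illustration of the Softmax Entropy measure mentioned above, the sketch below computes the Shannon entropy of a softmax distribution for a single point and averages it over a region. The function names and the mean aggregation over a region are assumptions made for illustration; they are not taken from the BaSAL code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution for one point."""
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def region_entropy(point_logits):
    """Mean per-point entropy over a region (assumed aggregation, for illustration)."""
    return sum(softmax_entropy(l) for l in point_logits) / len(point_logits)
```

A uniform prediction gives the maximum entropy (the log of the number of classes), while a confident prediction gives a value close to zero, so high-entropy regions are the ones an uncertainty-based strategy would tend to query first.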

Abstract

Active learning strives to reduce the need for costly data annotation by repeatedly querying an annotator to label the most informative samples from a pool of unlabeled data and then training a model on these samples. We identify two problems with existing active learning methods for LiDAR semantic segmentation. First, they overlook the severe class imbalance inherent in LiDAR semantic segmentation datasets. Second, to bootstrap the active learning loop when no labeled data is available, they train their initial model on randomly selected data samples, leading to low performance. This situation is referred to as the cold start problem. To address these problems, we propose BaSAL, a size-balanced warm start active learning model, based on the observation that each object class has a characteristic size. By sampling object clusters according to their size, we can create a size-balanced dataset that is also more class-balanced. Furthermore, in contrast to existing information measures like entropy or CoreSet, size-based sampling does not require a pretrained model, thus addressing the cold start problem effectively. Results show that we are able to improve the performance of the initial model by a large margin. Combining warm start and size-balanced sampling with established information measures, our approach achieves performance comparable to training on the entire SemanticKITTI dataset despite using only 5% of the annotations, outperforming existing active learning methods. We also match the existing state of the art in active learning on nuScenes. Our code is available at: https://github.com/Tony-WJR/BaSAL.
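The idea of sampling object clusters by size without any trained model can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `cluster_sizes`, `bin_edges`, and the round-robin selection rule are hypothetical choices made here to show how spreading picks over size bins avoids a selection dominated by the most common object sizes.

```python
import random
from collections import defaultdict


def size_balanced_sample(cluster_sizes, bin_edges, budget, seed=0):
    """Pick up to `budget` cluster indices, spreading picks evenly over size bins.

    cluster_sizes: one size per cluster (e.g. a point count); hypothetical input.
    bin_edges: ascending lower bounds of every size bin except the first.
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for idx, size in enumerate(cluster_sizes):
        # Bin index = number of edges this size reaches or exceeds.
        b = sum(size >= edge for edge in bin_edges)
        bins[b].append(idx)
    for members in bins.values():
        rng.shuffle(members)

    selected = []
    # Round-robin over bins so rare size groups are not drowned out.
    while len(selected) < budget and any(bins.values()):
        for b in sorted(bins):
            if bins[b] and len(selected) < budget:
                selected.append(bins[b].pop())
    return selected
```

With 90 small, 8 medium, and 2 large clusters and a budget of 6, this sketch returns two clusters from each size bin, whereas uniform random sampling would pick mostly small clusters. Because only cluster sizes are needed, the selection can run before any model has been trained.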

Paper Structure (31 sections, 4 equations, 5 figures, 7 tables)


Figures (5)

  • Figure 1: Overview of BaSAL. Our framework consists of a preprocessing step and the active learning loop. Size-balanced sampling is used to determine both the warm start set that initializes the active learning loop and the selected set in each subsequent iteration.
  • Figure 2: Adaptive binning over sizes. We split all size groups into 3 partitions, as indicated by colors. The accumulated API from each partition is approximately the same. We balance object classes by creating a size-based partition.
  • Figure 3: Results of different active learning strategies on SemanticKITTI and nuScenes using the SPVCNN and Minkowski networks. We compare BaSAL with existing works. The solid line is the performance of the fully supervised model. The dashed line indicates 95% of the performance of the fully supervised model. Our model outperforms all existing active learning approaches on SemanticKITTI and performs on par with the state-of-the-art active learning method LiDAL on nuScenes.
  • Figure 4: Qualitative comparison on SemanticKITTI using the Minkowski backbone. We visualize semantic segmentation results on two examples. Our model successfully detects the bicycle in (d), as indicated by the red box. In comparison, other models misclassify the bicycle as sidewalk in (b) and (c). In the second example, our model better detects the motorcycle in (h), while ReDAL over-segments it. We improve the performance on underrepresented classes.
  • Figure 5: Visualization of the object clusters and the supervoxels. Points with the same color belong to one supervoxel (cluster). Red bounding boxes indicate ground truth object clusters.
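The adaptive binning described in the Figure 2 caption, where size groups are split into partitions with roughly equal accumulated totals, can be sketched with a single greedy pass over cumulative sums. This is an illustrative sketch only: `group_scores` is a hypothetical stand-in for the per-size-group quantity accumulated in Figure 2, and the greedy threshold rule is an assumption, not the procedure from the paper.

```python
def adaptive_partitions(group_scores, num_partitions=3):
    """Split size groups (ordered by size) into contiguous partitions whose
    accumulated scores are roughly equal.

    group_scores: one non-negative score per size group (hypothetical input).
    Returns a list of partitions, each a list of group indices.
    """
    total = sum(group_scores)
    partitions = [[] for _ in range(num_partitions)]
    running = 0.0
    current = 0
    for idx, score in enumerate(group_scores):
        partitions[current].append(idx)
        running += score
        # Advance once the running sum reaches this partition's share of the total.
        while (current < num_partitions - 1
               and running >= total * (current + 1) / num_partitions):
            current += 1
    return partitions
```

For six groups with equal scores, the sketch yields three partitions of two groups each. When scores are skewed, partitions covering high-scoring groups become narrower. A single group that exceeds several shares at once can leave a partition empty, which a more careful implementation would need to handle.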