Distribution Discrepancy and Feature Heterogeneity for Active 3D Object Detection

Huang-Yu Chen, Jia-Fong Yeh, Jia-Wei Liao, Pin-Hsuan Peng, Winston H. Hsu

TL;DR

Distribution Discrepancy and Feature Heterogeneity (DDFH) is an active learning method that jointly considers geometric features and model embeddings and assesses informativeness at both the instance level and the frame level, enabling the model to learn efficiently with limited data.

Abstract

LiDAR-based 3D object detection is a critical technology for the development of autonomous driving and robotics. However, the high cost of data annotation limits its advancement. We propose a novel and effective active learning (AL) method called Distribution Discrepancy and Feature Heterogeneity (DDFH), which simultaneously considers geometric features and model embeddings, assessing information from both the instance-level and frame-level perspectives. Distribution Discrepancy evaluates the difference and novelty of instances within the unlabeled and labeled distributions, enabling the model to learn efficiently with limited data. Feature Heterogeneity ensures the heterogeneity of intra-frame instance features, maintaining feature diversity while avoiding redundant or similar instances, thus minimizing annotation costs. Finally, multiple indicators are efficiently aggregated using Quantile Transform, providing a unified measure of informativeness. Extensive experiments demonstrate that DDFH outperforms the current state-of-the-art (SOTA) methods on the KITTI and Waymo datasets, effectively reducing the bounding box annotation cost by 56.3% and showing robustness when working with both one-stage and two-stage models.
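
The aggregation step mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulation, so the following is only an assumption-laden illustration: it uses a plain rank-based empirical quantile mapping to [0, 1] (not any particular library's transform), equal weights when summing, and hypothetical indicator names and values. The point it shows is that indicators on very different scales become comparable once each is replaced by its quantile.

```python
from bisect import bisect_left, bisect_right


def quantile_transform(values):
    """Map each value to its empirical quantile in [0, 1]; ties share the mid-rank."""
    n = len(values)
    if n == 1:
        return [0.5]
    ordered = sorted(values)
    out = []
    for v in values:
        lo = bisect_left(ordered, v)        # first position holding v
        hi = bisect_right(ordered, v) - 1   # last position holding v
        out.append((lo + hi) / 2 / (n - 1))
    return out


def aggregate(indicators):
    """Sum the transformed indicators per frame (equal weighting is an assumption)."""
    transformed = [quantile_transform(col) for col in indicators.values()]
    return [sum(per_frame) for per_frame in zip(*transformed)]


# Hypothetical per-frame indicator values on very different scales.
indicators = {
    "discrepancy": [0.02, 0.90, 0.40, 0.10],
    "heterogeneity": [120.0, 30.0, 300.0, 45.0],
}
scores = aggregate(indicators)
best = max(range(len(scores)), key=scores.__getitem__)  # index of the top-ranked frame
```

Because only ranks matter after the transform, rescaling either indicator (for example, multiplying every heterogeneity value by 10) leaves the aggregated scores unchanged in this sketch.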

Paper Structure (14 sections, 8 equations, 5 figures, 3 tables)

Figures (5)

  • Figure 1: The three core concepts of DDFH. (a) Embedding and geometric features are used as DDFH inputs. (b) Considering instance-level distribution discrepancy and frame-level feature heterogeneity ensures that instances remain highly informative across all levels. (c) The various indicators are transformed with the Quantile Transform and then aggregated to estimate the final informativeness.
  • Figure 2: DDFH framework for LiDAR-based 3D active object detection. Following the batch active learning setup (Settles, 2009), one cycle corresponds to a single round of sampling. DDFH uses a Quantile Transform to normalize all metrics before aggregating them to estimate informativeness, and then updates the dataset before starting a new round.
  • Figure 3: 3D mAP(%) of DDFH and AL baselines on the KITTI val split with PV-RCNN.
  • Figure 4: (a-b) report the 3D APH of various AL methods at different difficulty levels on the Waymo dataset. (c-d) are experiments on the KITTI dataset. (c) shows the impact on performance when DDFH omits geometric features and replaces confidence balance with label balance. (d) reports the entropy of the number of samples selected for each class, comparing how effectively different AL methods balance annotation costs.
  • Figure 5: 3D mAP(%) of DDFH and the AL Baseline across various categories on the KITTI dataset at the moderate difficulty with SECOND.
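
The class-balance measure described for Figure 4(d) can be sketched as the Shannon entropy of the per-class selection counts. This is a minimal illustration under stated assumptions: the natural logarithm is used (the captions do not state the base), and the counts below are hypothetical values chosen only to contrast a balanced selection with a skewed one.

```python
import math


def selection_entropy(class_counts):
    """Shannon entropy (in nats) of the distribution of selected samples over classes."""
    total = sum(class_counts.values())
    if total == 0:
        return 0.0
    probs = [count / total for count in class_counts.values() if count > 0]
    return -sum(p * math.log(p) for p in probs)


# Hypothetical selection counts for the three KITTI classes.
balanced = {"Car": 100, "Pedestrian": 100, "Cyclist": 100}
skewed = {"Car": 280, "Pedestrian": 15, "Cyclist": 5}

balanced_entropy = selection_entropy(balanced)
skewed_entropy = selection_entropy(skewed)
```

Higher entropy means the selected boxes are spread more evenly across classes; the maximum for three classes is log 3, reached when each class receives the same number of selections.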