Uncertainty Meets Diversity: A Comprehensive Active Learning Framework for Indoor 3D Object Detection

Jiangyi Wang; Na Zhao

TL;DR

The paper tackles the high annotation cost of indoor 3D object detection with an active learning framework that combines uncertainty and diversity. For uncertainty, it estimates epistemic uncertainty from two sources, inaccurate detections and undetected objects, using a localization-aware score and an undetected-object count predictor, and fuses the two via normalized product scoring. For diversity, it proposes a Class-aware Adaptive Prototype (CAP) bank that dynamically allocates per-class prototypes to capture intra-class variance and scene-type distribution, and selects diverse samples by solving a prototype-histogram optimization with a partitioned, greedy approach. Evaluated on SUN RGB-D and ScanNetV2 with CAGroup3D as the detector, the method outperforms baselines by a significant margin and achieves over 85% of fully-supervised performance with only 10% of annotations. The approach reduces labeling effort for indoor 3D perception and may be adaptable to other indoor sensing tasks.
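As a rough illustration of what "normalized product scoring" could look like, the sketch below rescales two per-scene cues to [0, 1] and multiplies them. The min-max normalization, the function names, and the toy numbers are all assumptions made for this example; the summary does not specify the normalization the authors use.

```python
def min_max_normalize(values):
    """Rescale a list of scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_uncertainty(inaccuracy_scores, undetected_counts):
    """Hypothetical fusion: per-scene product of the two normalized cues."""
    a = min_max_normalize(inaccuracy_scores)
    b = min_max_normalize(undetected_counts)
    return [x * y for x, y in zip(a, b)]


# Toy values for four unlabeled scenes (made up for illustration).
inaccuracy = [0.2, 0.8, 0.5, 0.9]   # stand-in for a localization-aware score
undetected = [1.0, 3.0, 0.0, 2.0]   # stand-in for predicted undetected-object counts

scores = combined_uncertainty(inaccuracy, undetected)
# Scenes ordered from most to least uncertain under this toy fusion.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

A product, unlike a sum, only ranks a scene highly when both cues are high, which is one plausible reason to prefer it for fusing the two uncertainty sources.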

Abstract

Active learning has emerged as a promising approach to reduce the substantial annotation burden in 3D object detection tasks, spurring several initiatives in outdoor environments. However, its application in indoor environments remains unexplored. Compared to outdoor 3D datasets, indoor datasets face significant challenges, including fewer training samples per class, a greater number of classes, more severe class imbalance, and more diverse scene types and intra-class variances. This paper presents the first study on active learning for indoor 3D object detection, where we propose a novel framework tailored for this task. Our method incorporates two key criteria, uncertainty and diversity, to actively select the most ambiguous and informative unlabeled samples for annotation. The uncertainty criterion accounts for both inaccurate detections and undetected objects, ensuring that the most ambiguous samples are prioritized. Meanwhile, the diversity criterion is formulated as a joint optimization problem that maximizes the diversity of both object class distributions and scene types, using a new Class-aware Adaptive Prototype (CAP) bank. The CAP bank dynamically allocates representative prototypes to each class, helping to capture varying intra-class diversity across different categories. We evaluate our method on SUN RGB-D and ScanNetV2, where it outperforms baselines by a significant margin, achieving over 85% of fully-supervised performance with just 10% of the annotation budget.
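To make the diversity criterion more concrete, here is a minimal sketch of greedy selection over per-scene prototype histograms. It assumes each scene has already been summarized as counts over a fixed set of prototypes, and it uses the entropy of the accumulated histogram as a stand-in objective. The objective, the function names, and the toy data are assumptions for illustration, not the paper's formulation, and the CAP bank construction and partitioning step are not modeled.

```python
import math


def entropy(hist):
    """Shannon entropy of a non-negative histogram (0 for an empty one)."""
    total = sum(hist)
    if total == 0:
        return 0.0
    return -sum((h / total) * math.log(h / total) for h in hist if h > 0)


def greedy_diverse_select(histograms, budget):
    """Pick up to `budget` scenes; each step adds the scene that maximizes
    the entropy of the running sum of selected histograms."""
    selected = []
    running = [0] * len(histograms[0])
    remaining = set(range(len(histograms)))
    for _ in range(min(budget, len(histograms))):
        best = max(
            sorted(remaining),  # sorted so ties resolve to the lowest index
            key=lambda i: entropy([r + h for r, h in zip(running, histograms[i])]),
        )
        selected.append(best)
        remaining.remove(best)
        running = [r + h for r, h in zip(running, histograms[best])]
    return selected


# Toy per-scene counts over three hypothetical prototypes.
scene_hists = [
    [4, 0, 0],
    [3, 1, 0],
    [0, 0, 4],
    [0, 2, 2],
]
picked = greedy_diverse_select(scene_hists, budget=2)
```

Each greedy step costs one pass over the remaining scenes, so the full selection is quadratic in pool size in the worst case; splitting the pool into partitions, as the summary mentions, is one way such a procedure can be kept tractable.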
