Unified Unsupervised and Sparsely-Supervised 3D Object Detection by Semantic Pseudo-Labeling and Prototype Learning

Yushen He

Abstract

3D object detection is essential for autonomous driving and robotic perception, yet its reliance on large-scale manually annotated data limits scalability and adaptability. To reduce annotation dependency, unsupervised and sparsely-supervised paradigms have emerged. However, they face three intertwined challenges: low-quality pseudo-labels, unstable feature mining, and the lack of a unified training framework. This paper proposes SPL, a unified training framework for both Unsupervised and Sparsely-Supervised 3D Object Detection via Semantic Pseudo-labeling and prototype Learning. SPL first generates high-quality pseudo-labels by integrating image semantics, point cloud geometry, and temporal cues, producing 3D bounding boxes for dense objects and 3D point labels for sparse ones. These pseudo-labels are not used directly as supervision; instead, they serve as probabilistic priors within a novel multi-stage prototype learning strategy. This strategy stabilizes feature representation learning through memory-based initialization and momentum-based prototype updating, effectively mining features from both labeled and unlabeled data. Extensive experiments on the KITTI and nuScenes datasets demonstrate that SPL significantly outperforms state-of-the-art methods in both settings. Our work provides a robust and generalizable solution for learning 3D object detectors with minimal or no manual annotations.
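
The abstract names memory-based initialization and momentum-based prototype updating without stating the update rule. As an illustration only, the following minimal sketch shows one common form of these two steps: a prototype initialized as the mean of a feature memory bank, then refreshed by an exponential moving average. The function names, the L2 normalization, and the momentum value of 0.9 are assumptions made for this example, not details taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def mean_feature(features):
    """Element-wise mean of a non-empty list of equal-length feature vectors."""
    dim = len(features[0])
    return [sum(f[i] for f in features) / len(features) for i in range(dim)]


def init_prototype(memory_bank):
    """Hypothetical memory-based initialization: mean of stored features."""
    if not memory_bank:
        raise ValueError("memory bank must contain at least one feature")
    return l2_normalize(mean_feature(memory_bank))


def momentum_update(prototype, batch_features, momentum=0.9):
    """Hypothetical momentum update (exponential moving average).

    new = momentum * old + (1 - momentum) * mean(batch), then re-normalized.
    An empty batch leaves the prototype unchanged.
    """
    if not batch_features:
        return list(prototype)
    batch_mean = mean_feature(batch_features)
    mixed = [momentum * p + (1.0 - momentum) * b
             for p, b in zip(prototype, batch_mean)]
    return l2_normalize(mixed)


if __name__ == "__main__":
    # Toy 2-D features standing in for one class's object embeddings.
    proto = init_prototype([[1.0, 0.0], [1.0, 0.0]])
    proto = momentum_update(proto, [[0.0, 1.0]], momentum=0.9)
    print(proto)
```

A high momentum keeps the prototype close to its previous value, so a single batch of noisy pseudo-labeled features shifts it only slightly; this is the usual motivation for moving-average updates, though the paper's exact formulation may differ.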


Paper Structure (34 sections, 15 equations, 6 figures, 7 tables)


Introduction
Related Work
Fully-Supervised 3D Object Detection
Unsupervised 3D Object Detection
Sparsely-Supervised 3D Object Detection
Prototype-based Methods
Method
3D Pseudo Label Generation
Data Preprocessing:
3D Points Labels Generation:
3D Bbox Labels Generation:
Prototype-Based Training Strategy
Labels Processing:
Feature Mining:
Loss Function:
...and 19 more sections

Figures (6)

Figure 1: Comparison between existing unsupervised and sparsely-supervised 3D object detection methods and our proposed SPL framework.
Figure 2: Different contrastive learning strategies in sparsely-supervised 3D object detection.
Figure 3: Overview of the 3D pseudo label generation process.
Figure 4: Overview of the prototype-based training strategy.
Figure 5: Feature mining process combining prototype similarity and pseudo heatmap.
...and 1 more figure
