Building Blocks for Robust and Effective Semi-Supervised Real-World Object Detection

Moussa Kassem Sbeyti; Nadja Klein; Azarm Nowzad; Fikret Sivrikaya; Sahin Albayrak

TL;DR

Real-world semi-supervised object detection (SSOD) suffers from class imbalance, label noise, and missing pseudo-labels. The authors introduce four data-centric building blocks, Rare Class Collage (RCC), Rare Class Focus (RCF), Ground Truth Label Correction (GLC), and Pseudo-Label Selection (PLS), which improve label quality and class balance within a teacher-student SSOD framework through delta_s gating and lightweight integration. Experiments on KITTI and BDD100K show that pseudo-label quality matters more than quantity, and that combining RCC/RCF with GLC and PLS yields substantial gains (up to 6% in SSOD and up to 21% when combined), which is of practical benefit in autonomous-driving scenarios. The result is a model- and framework-agnostic toolkit for making SSOD more robust in real-world conditions, along with directions for extending these ideas to other domains and detectors.

Abstract

Semi-supervised object detection (SSOD) based on pseudo-labeling significantly reduces dependence on large labeled datasets by effectively leveraging both labeled and unlabeled data. However, real-world applications of SSOD often face critical challenges, including class imbalance, label noise, and labeling errors. We present an in-depth analysis of SSOD under real-world conditions, uncovering causes of suboptimal pseudo-labeling and key trade-offs between label quality and quantity. Based on our findings, we propose four building blocks that can be seamlessly integrated into an SSOD framework. Rare Class Collage (RCC): a data augmentation method that enhances the representation of rare classes by creating collages of rare objects. Rare Class Focus (RCF): a stratified batch sampling strategy that ensures a more balanced representation of all classes during training. Ground Truth Label Correction (GLC): a label refinement method that identifies and corrects false, missing, and noisy ground truth labels by leveraging the consistency of teacher model predictions. Pseudo-Label Selection (PLS): a selection method for removing low-quality pseudo-labeled images, guided by a novel metric estimating the missing detection rate while accounting for class rarity. We validate our methods through comprehensive experiments on autonomous driving datasets, resulting in up to 6% increase in SSOD performance. Overall, our investigation and novel, data-centric, and broadly applicable building blocks enable robust and effective SSOD in complex, real-world scenarios. Code is available at https://mos-ks.github.io/publications.
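To make the idea of stratified batch sampling behind RCF more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple class-aware scheme: pick a class uniformly at random, then pick an image that contains it. The function names (`build_class_index`, `sample_balanced_batch`) and the toy annotations are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def build_class_index(image_classes):
    """Map each class to the sorted list of image ids that contain it."""
    index = defaultdict(list)
    for image_id, classes in image_classes.items():
        for cls in set(classes):  # count each image once per class
            index[cls].append(image_id)
    return {cls: sorted(ids) for cls, ids in index.items()}


def sample_balanced_batch(image_classes, batch_size, rng):
    """Draw a batch by choosing a class uniformly, then an image with that class.

    Assumption: this class-aware scheme stands in for the paper's RCF,
    whose exact stratification rule is not given in the abstract.
    """
    index = build_class_index(image_classes)
    classes = sorted(index)
    batch = []
    for _ in range(batch_size):
        cls = rng.choice(classes)
        batch.append(rng.choice(index[cls]))
    return batch


# Hypothetical toy annotations: "tram" is rare (one image out of five).
image_classes = {
    "img0": ["car", "car"],
    "img1": ["car"],
    "img2": ["car", "pedestrian"],
    "img3": ["car"],
    "img4": ["tram"],
}

rng = random.Random(0)
batch = sample_balanced_batch(image_classes, 8, rng)
many = sample_balanced_batch(image_classes, 3000, rng)
tram_share = many.count("img4") / len(many)
```

Under uniform image sampling the only tram image would appear in about one fifth of the draws. With the class-aware scheme it appears in roughly one third, because each of the three classes is equally likely to be chosen first. Images may repeat within a batch in this sketch; a real data loader would likely deduplicate or sample without replacement.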
