Semi-Supervised Object Detection: A Survey on Progress from CNN to Transformer

Tahira Shehzadi, Ifza, Didier Stricker, Muhammad Zeshan Afzal

TL;DR

This survey reviews the rapid progress of Semi-Supervised Object Detection (SSOD) from CNN-based methods to Transformer-based approaches, focusing on leveraging both labeled and unlabeled data to improve detection. It categorizes techniques into data augmentation, pseudo-labeling, consistency regularization, and adversarial training, and surveys 27 representative methods. It also analyzes datasets like COCO and VOC, compares CNN-based and Transformer-based SSOD models, and discusses open challenges and future directions. The work aims to guide future research and practical deployment in domains with limited annotations.

Abstract

Advances in semi-supervised learning have driven researchers to explore its potential for object detection in computer vision. Semi-Supervised Object Detection (SSOD) combines a small labeled dataset with a larger unlabeled one. This approach reduces the dependence on large labeled datasets, which are often expensive and time-consuming to obtain. Initially, SSOD models struggled to exploit unlabeled data effectively and to manage noise in the pseudo-labels generated for it. Numerous recent advancements have addressed these issues, resulting in substantial improvements in SSOD performance. This paper presents a comprehensive review of 27 cutting-edge developments in SSOD methodologies, from Convolutional Neural Networks (CNNs) to Transformers. We examine the core components of semi-supervised learning and its integration into object detection frameworks, covering data augmentation techniques, pseudo-labeling strategies, consistency regularization, and adversarial training methods. Furthermore, we conduct a comparative analysis of various SSOD models, evaluating their performance and architectural differences. We aim to stimulate further research interest in overcoming existing challenges and exploring new directions in semi-supervised learning for object detection.
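As a rough illustration of the pseudo-labeling idea mentioned in the abstract, the sketch below shows two steps commonly found in teacher-student SSOD pipelines: keeping only confident teacher detections as pseudo-labels, and updating the teacher as an exponential moving average (EMA) of the student. The names `Detection`, `filter_pseudo_labels`, and `ema_update`, the 0.7 threshold, and the 0.999 momentum are assumptions chosen for illustration. They are not taken from the survey or from any specific method it covers.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Detection:
    """One teacher prediction on an unlabeled image (hypothetical structure)."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2)
    label: int
    score: float  # teacher confidence in [0, 1]


def filter_pseudo_labels(detections: List[Detection],
                         threshold: float = 0.7) -> List[Detection]:
    """Keep detections whose confidence meets the threshold.

    Low-confidence boxes are dropped to limit pseudo-label noise.
    The default threshold is an illustrative assumption.
    """
    return [d for d in detections if d.score >= threshold]


def ema_update(teacher: Dict[str, float],
               student: Dict[str, float],
               momentum: float = 0.999) -> Dict[str, float]:
    """Return new teacher weights as an EMA of the student weights.

    Weights are plain floats keyed by name to keep the sketch
    dependency-free; a real model would hold tensors instead.
    """
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * student[name]
        for name in teacher
    }


if __name__ == "__main__":
    preds = [
        Detection((0, 0, 10, 10), label=1, score=0.9),
        Detection((5, 5, 20, 20), label=2, score=0.4),
    ]
    print(len(filter_pseudo_labels(preds)))
    print(ema_update({"w": 1.0}, {"w": 0.0}, momentum=0.9))
```

A fixed threshold is the simplest choice. Methods in this area differ mainly in how they replace or refine that filtering step, so this sketch should be read as a baseline pattern rather than any particular method.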

Paper Structure (59 sections, 31 figures, 4 tables)

Figures (31)

  • Figure 1: Semi-Supervised Object Detection: A Comprehensive Review and Taxonomy of Techniques.
  • Figure 2: Teacher-Student Architecture for Semi-Supervised Object Detection.
  • Figure 3: Framework of One Teacher [OneT96].
  • Figure 4: Framework of DSL [DSL_CVPR_22].
  • Figure 5: Framework of Dense Teacher [DenseTeacher].
  • ...and 26 more figures