Semi-Supervised Object Detection: A Survey on Progress from CNN to Transformer
Tahira Shehzadi, Ifza, Didier Stricker, Muhammad Zeshan Afzal
TL;DR
This survey reviews the rapid progress of Semi-Supervised Object Detection (SSOD) from CNN-based methods to Transformer-based approaches, focusing on how labeled and unlabeled data are used together to improve detection. It groups techniques into data augmentation, pseudo-labeling, consistency regularization, and adversarial training, and covers 27 representative methods. It also analyzes datasets such as COCO and VOC, compares CNN-based and Transformer-based SSOD models, and discusses open challenges and future directions. The work aims to guide future research and practical deployment in domains with limited annotations.
Abstract
Impressive advances in semi-supervised learning have driven researchers to explore its potential for object detection in computer vision. Semi-Supervised Object Detection (SSOD) combines a small labeled dataset with a larger unlabeled one, reducing the dependence on large labeled datasets, which are often expensive and time-consuming to obtain. Early SSOD models struggled to exploit unlabeled data effectively and to manage the noise in the pseudo-labels generated for it. Numerous recent advances have addressed these issues, yielding substantial improvements in SSOD performance. This paper presents a comprehensive review of 27 cutting-edge developments in SSOD methodologies, from Convolutional Neural Networks (CNNs) to Transformers. We examine the core components of semi-supervised learning and its integration into object detection frameworks, covering data augmentation techniques, pseudo-labeling strategies, consistency regularization, and adversarial training methods. Furthermore, we conduct a comparative analysis of various SSOD models, evaluating their performance and architectural differences. We aim to stimulate further research interest in overcoming existing challenges and exploring new directions in semi-supervised learning for object detection.
