Boosting Semi-Supervised Object Detection in Remote Sensing Images With Active Teaching

Boxuan Zhang; Zengmao Wang; Bo Du

TL;DR

This work tackles the scarcity of object-level annotations in remote sensing images by combining semi-supervised learning with active learning in a teacher–student framework (SSOD-AT). It introduces the RoI Comparison Module (RoICM), which generates reliable pseudo-labels from RoIs on which teacher and student are consistent and identifies uncertain images for labeling, while a Global Class Prototype enforces diversity among the selected samples. Three mechanisms are central: the unsupervised loss term is down-weighted by the KL divergence $D_{kl}$ between teacher and student predictions, $\mathcal{L}_{det}^{ts} = \mathcal{L}_{det}^{sup} + \exp(-D_{kl}) \cdot \lambda_u \cdot \mathcal{L}_{det}^{unsup}$; class prototypes are updated by an exponential moving average (EMA), $g_k = \alpha g_k + (1-\alpha) v_k$; and uncertainty and diversity are combined into a final sample score, $S_{sel}^{i} = \sqrt[p]{(S_{unc}^{i})^{p} + (S_{div}^{i})^{p}}$. Evaluations on DOTA and DIOR show that SSOD-AT outperforms state-of-the-art methods by about 1% in most cases across the active-learning steps, indicating improved annotation efficiency for remote-sensing object detection.
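The three formulas in the TL;DR can be written out as a minimal NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation: the function names, the default values of `lambda_u`, `alpha`, and `p`, the direction of the KL divergence (teacher relative to student), and the averaging over RoIs are all hypothetical choices that the summary does not specify.

```python
import numpy as np


def kl_divergence(teacher_probs, student_probs, eps=1e-12):
    """Mean KL(teacher || student) over RoIs.

    Both inputs are (n_rois, n_classes) arrays of class probabilities.
    The KL direction and the mean over RoIs are assumptions.
    """
    p = np.clip(teacher_probs, eps, 1.0)
    q = np.clip(student_probs, eps, 1.0)
    return float(np.mean(np.sum(p * np.log(p / q), axis=1)))


def teacher_student_loss(loss_sup, loss_unsup, d_kl, lambda_u=1.0):
    """L_det^ts = L_det^sup + exp(-D_kl) * lambda_u * L_det^unsup.

    A large teacher-student disagreement (large D_kl) shrinks the
    contribution of the unsupervised term.
    """
    return loss_sup + np.exp(-d_kl) * lambda_u * loss_unsup


def ema_update(g_k, v_k, alpha=0.99):
    """EMA prototype update: g_k = alpha * g_k + (1 - alpha) * v_k."""
    return alpha * g_k + (1.0 - alpha) * v_k


def selection_score(s_unc, s_div, p=2.0):
    """S_sel = (S_unc^p + S_div^p)^(1/p); scores assumed non-negative."""
    return (s_unc ** p + s_div ** p) ** (1.0 / p)


# Usage with made-up numbers: identical predictions give D_kl = 0,
# so the unsupervised term keeps its full weight lambda_u.
teacher = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
d_kl = kl_divergence(teacher, teacher)
total = teacher_student_loss(loss_sup=1.0, loss_unsup=2.0, d_kl=d_kl, lambda_u=0.5)
prototype = ema_update(np.array([1.0, 0.0]), np.array([0.0, 1.0]), alpha=0.9)
score = selection_score(3.0, 4.0, p=2.0)
```

With `p = 2` the combined score is the Euclidean norm of the two criteria; larger `p` moves it toward the maximum of the two, so an image that is strong on either uncertainty or diversity alone still ranks highly.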

Abstract

The lack of object-level annotations poses a significant challenge for object detection in remote sensing images (RSIs). To address this issue, active learning (AL) and semi-supervised learning (SSL) techniques have been proposed to enhance the quality and quantity of annotations. AL focuses on selecting the most informative samples for annotation, while SSL leverages the knowledge in unlabeled samples. In this letter, we propose a novel AL method, called SSOD-AT, to boost semi-supervised object detection (SSOD) for remote sensing images with a teacher-student network. The proposed method incorporates an RoI comparison module (RoICM) to generate high-confidence pseudo-labels for regions of interest (RoIs). The RoICM is also used to identify the top-K uncertain images. To reduce redundancy among the top-K uncertain images sent for human labeling, a diversity criterion is introduced based on object-level prototypes of different categories, computed from both labeled and pseudo-labeled images. Extensive experiments on two popular datasets, DOTA and DIOR, demonstrate that the proposed method outperforms state-of-the-art (SOTA) methods for object detection in RSIs. Compared with the best-performing SOTA method, the proposed method achieves an improvement of about 1 percent in most cases throughout the AL process.

Paper Structure (11 sections, 10 equations, 3 figures, 4 tables)

Introduction
Methodology
RoI Comparison Module (RoICM) for Uncertainty selection
Global Class Prototype for Diversity selection
Experiment And Analysis
Datasets and Experiments Designation
Experimental Results and Analysis
Compare With State-of-the-Art Methods
Ablation Study
Visualization Analysis
Conclusion

Figures (3)

Figure 1: Overview of the three stages of the SSOD-AT framework. Semi-Supervised Object Detection (SSOD): the limited labeled set is used to initialize the parameters of the teacher-student framework. Active Learning (AL): the top-N most valuable samples are selected for labeling. Label Set Augmentation (Oracle): the actively selected samples are added to the labeled set. These steps are repeated to train the teacher-student framework.
Figure 2: Detection results of the different algorithms on the two remote-sensing datasets. (a) DOTA. (b) DIOR.
Figure 3: Visualization of the images with the highest selection priority under different active sampling strategies. The Prediction columns show the pseudo-labels predicted by the teacher network with 30% (DOTA) and 20% (DIOR) labeled proportions, while the GT columns show the corresponding ground truths.
