Collaboration of Teachers for Semi-supervised Object Detection

Liyu Chen, Huaao Tang, Yi Wen, Hanting Chen, Wei Li, Junchao Liu, Jie Hu

TL;DR

This work tackles teacher–student weight coupling and confirmation bias in semi-supervised object detection (SSOD) by introducing the Collaboration of Teachers Framework (CTF), which trains multiple decoupled teacher–student pairs together with a Data Performance Consistency Optimization (DPCO) module. The burn-in phase yields teachers with diverse perspectives, and the two-stage training then uses DPCO to identify the best teacher from its accumulative labeled loss, so that its reliable pseudo-labels guide the other pairs. The training objective is $L_{total} = L_l + \lambda_u L_u + \beta L_{DPC}$ with $\beta=2$, and each teacher is updated by EMA as $W_t = (1-\alpha)W_t + \alpha W_s$. Empirically, CTF with DPCO improves mAP on the COCO-PARTIAL and VOC-PARTIAL benchmarks (e.g., +0.71 mAP on 10% labeled COCO and +0.89 mAP on VOC over LabelMatch) and converges faster than prior SSOD methods, while remaining plug-and-play with existing approaches. The approach reduces confirmation bias, improves the utilization of unlabeled data, and offers a general framework for moving SSOD beyond the EMA-based single-teacher paradigm.
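The two formulas in the summary above can be illustrated with a minimal Python sketch. The function names and the use of flat lists of floats for weights are assumptions made for illustration only; the value of `lambda_u` is not given in this summary, so it is left as a required parameter, and only `beta = 2` comes from the text.

```python
def ema_update(teacher_weights, student_weights, alpha):
    """Apply W_t <- (1 - alpha) * W_t + alpha * W_s element-wise.

    Weights are plain lists of floats here (an illustrative simplification).
    """
    return [
        (1.0 - alpha) * w_t + alpha * w_s
        for w_t, w_s in zip(teacher_weights, student_weights)
    ]


def total_loss(loss_labeled, loss_unlabeled, loss_dpc, lambda_u, beta=2.0):
    """Combine the terms as L_total = L_l + lambda_u * L_u + beta * L_DPC.

    beta defaults to 2 as stated in the summary; lambda_u must be supplied.
    """
    return loss_labeled + lambda_u * loss_unlabeled + beta * loss_dpc


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the two functions.
    new_teacher = ema_update([1.0, 0.0], [0.0, 1.0], alpha=0.25)
    print(new_teacher)
    print(total_loss(1.0, 0.5, 0.25, lambda_u=2.0))
```

A small `alpha` keeps the teacher close to its previous weights, which is the source of the teacher–student coupling that the summary describes.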

Abstract

Recent semi-supervised object detection (SSOD) has achieved remarkable progress by leveraging unlabeled data for training. Mainstream SSOD methods rely on Consistency Regularization and Exponential Moving Average (EMA), which together form a cyclic data flow. However, the EMA update leads to weight coupling between the teacher and student models. In a cyclic data flow, this coupling reduces the utilization of information in unlabeled data and causes confirmation bias on low-quality or erroneous pseudo-labels. To address these issues, we propose the Collaboration of Teachers Framework (CTF), which trains multiple pairs of teacher and student models. During CTF training, the Data Performance Consistency Optimization (DPCO) module identifies the teacher that has produced the best pseudo-labels over the preceding training process, and the pseudo-labels generated by this best-performing teacher then guide the other student models. As a consequence, the framework greatly improves the utilization of unlabeled data and prevents the positive feedback cycle of unreliable pseudo-labels. CTF achieves outstanding results on numerous SSOD datasets, including a 0.71% mAP improvement on the 10% annotated COCO dataset and a 0.89% mAP improvement on the VOC dataset compared to LabelMatch, while converging significantly faster. Moreover, CTF is plug-and-play and can be integrated with other mainstream SSOD methods.
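The teacher selection described in the abstract can be sketched as follows. This is a hedged illustration, not the paper's implementation: the class name `LabeledLossTracker`, the plain running sum, and the rule "lowest accumulated labeled loss wins" are assumptions. The text here states only that DPCO uses an accumulative labeled loss to pick the most reliable pseudo-labels.

```python
class LabeledLossTracker:
    """Track an accumulated labeled loss per teacher-student pair.

    Hypothetical helper: a simple running sum stands in for whatever
    accumulation scheme the DPCO module actually uses.
    """

    def __init__(self, num_pairs):
        self.accumulated = [0.0] * num_pairs

    def update(self, labeled_losses):
        """Add one training step's labeled loss for every pair."""
        if len(labeled_losses) != len(self.accumulated):
            raise ValueError("expected one labeled loss per pair")
        for i, loss in enumerate(labeled_losses):
            self.accumulated[i] += loss

    def best_teacher(self):
        """Return the index of the pair with the lowest accumulated loss
        (assumed to be the most reliable source of pseudo-labels)."""
        return min(range(len(self.accumulated)),
                   key=self.accumulated.__getitem__)


if __name__ == "__main__":
    tracker = LabeledLossTracker(num_pairs=2)
    tracker.update([0.9, 0.7])  # hypothetical per-pair labeled losses
    tracker.update([0.5, 0.8])
    best = tracker.best_teacher()
    # Pseudo-labels from teacher `best` would then supervise the other pairs.
    print(best)
```

Selecting on labeled data gives a criterion that does not depend on the pseudo-labels themselves, which matches the abstract's goal of avoiding a positive feedback cycle of unreliable pseudo-labels.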

Paper Structure

This paper contains 19 sections, 9 equations, 10 figures, 4 tables, and 1 algorithm.

Figures (10)

  • Figure 1: Data flow in different methods. (a) The mainstream SSOD method. (b) Differently initialized students with a stabilization constraint [ke2019dual]. (c) Two pairs of teacher and student forming a cyclic data flow [liu2022cycle]. (d) Multiple pairs of teacher and student forming a cross data flow.
  • Figure 2: The mAP of Teacher Model 1 minus the mAP of Teacher Model 2 for each image. Above the orange line, Model 1 significantly outperforms Model 2; below the green line, Model 2 performs better.
  • Figure 3: The relationship between model weight distances and performance. The dotted line represents LabelMatch [chen2022label], while the solid line represents our approach.
  • Figure 4: An overview of the Collaboration of Teachers Framework (number of pairs = 2). Before training starts, the teacher and student within the same pair have the same weights, while models in different pairs have different weights. During training, the DPCO module calculates the accumulative labeled loss to select the most reliable pseudo-labels, allowing them to propagate among different pairs.
  • Figure 5: Comparison between LabelMatch [chen2022label] and our method during training. This experiment was conducted using the 10% COCO-PARTIAL setting.
  • ...and 5 more figures