Enhancing cross-domain detection: adaptive class-aware contrastive transformer

Ziru Zeng, Yue Ding, Hongtao Lu

TL;DR

This work tackles cross-domain object detection with transformer-based detectors under target-domain label scarcity. It introduces an adaptive class-aware contrastive transformer (ACCT) framework that integrates IoU-guided pseudo-label refinement, per-class adaptive thresholds via Gaussian Mixture Models, and an instance-level class-aware contrastive loss within a mean-teacher and adversarial learning setup. The approach yields improved detection performance across weather, synthetic-to-real, and scene adaptation benchmarks, especially for minority classes, by stabilizing pseudo-label quality and promoting discriminative features. The work offers a practical pathway to robust, post-processing-free cross-domain detection with transformer architectures.

Abstract

Recently, the detection transformer has gained substantial attention for its inherently minimal post-processing requirements. However, this paradigm relies on abundant training data, and in cross-domain adaptation, insufficient labels in the target domain exacerbate class imbalance and degrade model performance. To address these challenges, we propose a novel class-aware cross-domain detection transformer based on adversarial learning and the mean-teacher framework. First, considering the inconsistencies between the classification and regression tasks, we introduce an IoU-aware prediction branch and exploit the consistency of classification and localization scores to filter and reweight pseudo labels. Second, we devise a dynamic category threshold refinement to adaptively manage model confidence. Third, to alleviate class imbalance, we present an instance-level class-aware contrastive learning module that encourages discriminative features for each class, particularly benefiting minority classes. Experimental results across diverse domain-adaptive scenarios validate our method's effectiveness in improving performance and alleviating class imbalance, outperforming state-of-the-art transformer-based methods.
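
The instance-level class-aware contrastive idea mentioned in the abstract can be illustrated with a small sketch. This section does not give the paper's exact loss, so the code below assumes a supervised-contrastive (InfoNCE-style) formulation in which instances sharing a (pseudo) label are positives and all other instances are negatives. The function name `class_aware_contrastive_loss`, the `temperature` value, and the toy features are hypothetical and chosen only for illustration.

```python
import numpy as np

def class_aware_contrastive_loss(features, labels, temperature=0.1):
    """Assumed SupCon-style loss: pull same-class instances together,
    push different-class instances apart. Not the paper's exact loss."""
    f = np.asarray(features, dtype=float)
    y = np.asarray(labels)
    n = y.shape[0]
    # Positives: other instances carrying the same class label.
    pos = (y[:, None] == y[None, :]) & ~np.eye(n, dtype=bool)
    n_pos = pos.sum(axis=1)
    valid = n_pos > 0  # anchors that have at least one positive
    if not valid.any():
        return 0.0
    # Cosine similarity between L2-normalised instance features.
    f = f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-12)
    sim = f @ f.T / temperature
    np.fill_diagonal(sim, -np.inf)  # an anchor never contrasts with itself
    # Numerically stable log-softmax over all other instances.
    sim = sim - sim.max(axis=1, keepdims=True)
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # Average negative log-probability of the positives, per anchor.
    per_anchor = -np.where(pos, log_prob, 0.0).sum(axis=1)
    return float((per_anchor[valid] / n_pos[valid]).mean())

# Toy instance features: two tight clusters in 2-D.
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_matched = class_aware_contrastive_loss(feats, [0, 0, 1, 1])
loss_mismatched = class_aware_contrastive_loss(feats, [0, 1, 0, 1])
print(loss_matched, loss_mismatched)
```

When the labels agree with the feature clusters the loss is small; when they cut across the clusters it is large. This is the sense in which such a loss rewards discriminative per-class features. Averaging per anchor rather than over all pairs means that a class with few instances is not drowned out by a class with many pairs, which is one plausible reason a loss of this kind would help minority classes.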

Paper Structure (12 sections, 12 equations, 4 figures, 4 tables)

This paper contains 12 sections, 12 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Per-class object counts in the Foggy Cityscapes training set, together with the ratio of pseudo labels to ground-truth labels under our adaptive class-aware threshold and under a unified static threshold.
  • Figure 2: Overview of the proposed framework. In the burn-up stage, the teacher model's data flow is frozen and adversarial feature learning is used to train the student model. In the mutual-learning stage, the teacher model is activated to generate pseudo labels for weakly augmented target-domain images. A GMM fitted on the combined confidence yields the per-class threshold used to filter the boxes. RoIAlign then extracts the features of the filtered boxes, and the contrastive learning module computes the contrastive loss.
  • Figure 3: Visualizations of weather adaptation results for different transformer-based methods.
  • Figure 4: t-SNE visualization of the Foggy Cityscapes validation set.
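
The per-class thresholding step described in the Figure 2 caption can be sketched roughly as follows. Several details are assumptions rather than the paper's specification: the combined confidence is taken to be the product of the classification score and the predicted IoU, each class is modelled with a two-component 1-D Gaussian mixture fitted by plain EM, and the class threshold is the lowest score assigned to the high-mean component. The names `fit_gmm_1d` and `class_thresholds` and the `fallback` value are hypothetical.

```python
import numpy as np

def fit_gmm_1d(x, n_iter=50, eps=1e-6):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    x = np.asarray(x, dtype=float)
    mu = np.array([x.min(), x.max()])   # start the means at the extremes
    var = np.full(2, x.var() + eps)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
        dens = dens / np.sqrt(2 * np.pi * var)
        resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: update mixture weights, means and variances.
        nk = resp.sum(axis=0) + eps
        pi = nk / nk.sum()
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + eps
    return mu, resp

def class_thresholds(cls_scores, iou_scores, labels, num_classes, fallback=0.5):
    """Per-class thresholds on an assumed combined confidence (cls * IoU)."""
    combined = np.asarray(cls_scores) * np.asarray(iou_scores)
    labels = np.asarray(labels)
    thresholds = np.full(num_classes, fallback)
    for c in range(num_classes):
        s = combined[labels == c]
        if s.size < 4:          # too few boxes to fit a mixture: keep fallback
            continue
        mu, resp = fit_gmm_1d(s)
        high = int(np.argmax(mu))           # component with the higher mean
        reliable = resp[:, high] >= 0.5     # boxes assigned to that component
        if reliable.any():
            thresholds[c] = s[reliable].min()
    return combined, thresholds

# Toy demo: class 0 is confident overall, class 1 has lower scores,
# class 2 has no predictions and falls back to the static value.
cls_scores = np.concatenate([
    np.linspace(0.15, 0.25, 20), np.linspace(0.75, 0.85, 20),   # class 0
    np.linspace(0.05, 0.15, 20), np.linspace(0.45, 0.55, 20),   # class 1
])
iou_scores = np.ones_like(cls_scores)   # predicted IoU fixed to 1 for simplicity
labels = np.repeat([0, 1], 40)
combined, thresholds = class_thresholds(cls_scores, iou_scores, labels, num_classes=3)
keep = combined >= thresholds[labels]   # boxes retained as pseudo labels
print(thresholds, int(keep.sum()))
```

In this toy setting a single static threshold of, say, 0.7 would discard every box of the lower-scoring class, while the per-class thresholds retain the reliable half of each class. That is the kind of gap between adaptive and static thresholding that the Figure 1 caption refers to.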