DSD-DA: Distillation-based Source Debiasing for Domain Adaptive Object Detection

Yongchao Feng, Shiwei Li, Yingjie Gao, Ziyue Huang, Yanan Zhang, Qingjie Liu, Yunhong Wang

TL;DR

Domain Adaptive Object Detection methods often overfit to source data, limiting transfer to the target domain due to source bias and misalignment between classification and localization. The authors propose DSD-DA, combining Distillation-based Source Debiasing (DSD) with a Target-Relevant Object Localization Network (TROLN) and a Domain-aware Consistency Enhancement (DCE) to distill domain-agnostic knowledge and harmonize predictions. A classification-teacher learns domain-agnostic features from mix-style data, while TROLN emphasizes target-relevant localization; distillation transfers this knowledge to the detector, and DCE refines classification scores at test time using a localization score that fuses centerness, IoU, and target affinities. Across Cityscapes-FoggyCityscapes, KITTI-Cityscapes, and SIM10k-Cityscapes, DSD-DA yields consistent, substantial gains over strong baselines and prior alignment-based methods, demonstrating improved cross-domain robustness and reduced source bias.

Abstract

Though feature-alignment-based Domain Adaptive Object Detection (DAOD) methods have achieved remarkable progress, they ignore the source bias issue, i.e., the detector tends to acquire more source-specific knowledge, which impedes its generalization in the target domain. Furthermore, these methods find it harder to achieve consistent classification and localization in the target domain than in the source domain. To overcome these challenges, we propose a novel Distillation-based Source Debiasing (DSD) framework for DAOD, which distills domain-agnostic knowledge from a pre-trained teacher model, improving the detector's performance on both domains. In addition, we design a Target-Relevant Object Localization Network (TROLN), which mines target-related localization information from source and target-style mixed data. Accordingly, we present a Domain-aware Consistency Enhancing (DCE) strategy, in which this information is formulated into a new localization representation to further refine classification scores in the testing stage, harmonizing classification and localization. Extensive experiments demonstrate the effectiveness of this method, which consistently improves the strong baseline by large margins and outperforms existing alignment-based works.
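
The test-time refinement described above can be illustrated with a minimal sketch. The summary states only that the localization score fuses centerness, IoU, and target affinities and is then used to refine classification scores; it does not give the fusion formula. The geometric mean, the blending exponent `alpha`, and the function names `localization_score` and `refine_scores` below are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def localization_score(centerness, pred_iou, target_affinity):
    """Fuse three per-box cues in [0, 1] into one localization score.

    ASSUMPTION: a geometric mean is used here purely for illustration;
    the source does not specify how the three cues are combined.
    """
    cues = np.stack([centerness, pred_iou, target_affinity]).astype(float)
    cues = np.clip(cues, 1e-6, 1.0)  # avoid log(0)
    return np.exp(np.log(cues).mean(axis=0))


def refine_scores(cls_scores, loc_scores, alpha=0.5):
    """Blend classification and localization scores per box.

    ASSUMPTION: alpha is a hypothetical weight; alpha=0 keeps the
    original classification scores unchanged.
    """
    cls_scores = np.asarray(cls_scores, dtype=float)
    loc_scores = np.asarray(loc_scores, dtype=float)
    return cls_scores ** (1.0 - alpha) * loc_scores ** alpha


# Two boxes with the same classification score but different localization cues.
cls = np.array([0.9, 0.9])
loc = localization_score(
    centerness=[0.9, 0.3],
    pred_iou=[0.9, 0.3],
    target_affinity=[0.9, 0.3],
)
refined = refine_scores(cls, loc)
```

In this sketch the poorly localized box ends up with a lower refined score than the well-localized one, which is the kind of classification/localization agreement the DCE strategy aims for.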

Paper Structure (26 sections, 13 equations, 8 figures, 13 tables)

This paper contains 26 sections, 13 equations, 8 figures, 13 tables.

Figures (8)

  • Figure 1: In (a) traditional alignment approaches, the detector tends to learn more source-specific knowledge due to the supervised detection loss $L_{det}$, rather than domain-agnostic knowledge (b). In (c) our proposed DSD framework, by introducing a distillation loss to the source data, the model can acquire more domain-agnostic knowledge than (a). The red and green dotted arrows represent the impact of source- and target-related losses on knowledge transfer, respectively.
  • Figure 2: Demonstrative cases of the exacerbated inconsistency between classification and localization for the alignment-based detector DA-Faster [chen2018domain]. The upper-row panels (a) and (b) show DA-Faster's detection results on Cityscapes and FoggyCityscapes, respectively. The lower-row panels (c) and (d) display the correlation between localization ground-truth ($x$-axis, represented by the IoU between the bounding box and its matched ground-truth) and classification scores ($y$-axis) for 500 randomly sampled boxes detected by DA-Faster in Cityscapes and FoggyCityscapes, respectively. The bounding boxes are filtered based on an IoU ($\ge 0.5$) with the corresponding ground-truth.
  • Figure 3: Overview of the proposed distillation-based source debiasing framework for DAOD. Part i shows the teacher models' training stage, which includes training a mix-style classifier and a Target-Relevant Object Localization Network (TROLN). Part ii demonstrates distillation-based source debiasing (DSD) training, in which the cross-domain detector is trained. In Part iii, the Domain-aware Consistency Enhancement (DCE) strategy is introduced to refine the detector's classification scores in the testing phase, enhancing the consistency between classification and localization.
  • Figure 4: The correlation between localization ground-truth and centerness/IoU/localization scores of the bounding boxes on the target test dataset. The bounding boxes are filtered based on an IoU ($\ge 0.5$) with the corresponding ground-truth.
  • Figure 5: The variation in consistency across different classes before and after classification score refinement on the FoggyCityscapes test dataset.
  • ...and 3 more figures
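
The consistency analysis described in the Figure 2 and Figure 4 captions (correlating each detected box's IoU with its matched ground-truth against a per-box score, after keeping only boxes with IoU $\ge 0.5$) can be approximated as follows. This is a sketch under assumptions: the captions do not say which correlation measure is used, so Pearson correlation is assumed, and the helper names `box_iou` and `consistency` are hypothetical.

```python
import numpy as np


def box_iou(box, gt):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box[0], gt[0]), max(box[1], gt[1])
    x2, y2 = min(box[2], gt[2]), min(box[3], gt[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    area_gt = (gt[2] - gt[0]) * (gt[3] - gt[1])
    union = area_box + area_gt - inter
    return inter / union if union > 0 else 0.0


def consistency(ious, scores, iou_thresh=0.5):
    """Pearson correlation between IoU and score over boxes with IoU >= iou_thresh.

    ASSUMPTION: Pearson correlation stands in for the unspecified
    "correlation" in the figure captions.
    """
    ious = np.asarray(ious, dtype=float)
    scores = np.asarray(scores, dtype=float)
    keep = ious >= iou_thresh
    if keep.sum() < 2:
        return float("nan")  # correlation is undefined for fewer than two boxes
    return float(np.corrcoef(ious[keep], scores[keep])[0, 1])


# Toy data: the last box (IoU 0.3) is dropped by the IoU filter.
toy_ious = [0.5, 0.6, 0.7, 0.8, 0.3]
toy_scores = [0.5, 0.6, 0.7, 0.8, 0.99]
corr = consistency(toy_ious, toy_scores)
overlap = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A value of `corr` near 1 means scores track localization quality closely; the captions suggest this agreement is weaker on the target domain before refinement.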