Differential Alignment for Domain Adaptive Object Detection

Xinyu He, Xinhui Li, Xiaojie Guo

TL;DR

This work tackles domain adaptive object detection by shifting from uniform to differential feature alignment. It introduces two modules: PDFA, which weights instance-level alignment by prediction discrepancies between a teacher and student, and UFOA, which guides image-level alignment to emphasize foreground regions using a foreground/background mask and an uncertainty-based balance. Integrated into an adaptive teacher–student framework with image- and instance-level discriminators, the method combines supervised, unsupervised, and adversarial losses to maximize domain-invariant detection performance. Empirical results on Cityscapes→Foggy Cityscapes, Sim10k→Cityscapes, and Cityscapes→BDD100K show strong improvements over state-of-the-art methods, validating the effectiveness and robustness of differential alignment in DAOD.

Abstract

Domain adaptive object detection (DAOD) aims to generalize an object detector trained on labeled source-domain data to a target domain without annotations, the core principle of which is source-target feature alignment. Typically, existing approaches employ adversarial learning to align the distributions of the source and target domains as a whole, barely considering the varying significance of distinct regions, such as instances under different circumstances and foreground vs. background areas, during feature alignment. To overcome this shortcoming, we investigate a differential feature alignment strategy. Specifically, a prediction-discrepancy feedback instance alignment module (dubbed PDFA) is designed to adaptively assign higher weights to instances with higher teacher-student detection discrepancy, effectively handling heavier domain-specific information. Additionally, an uncertainty-based foreground-oriented image alignment module (UFOA) is proposed to explicitly guide the model to focus more on regions of interest. Extensive experiments on widely used DAOD datasets, together with ablation studies, are conducted to demonstrate the efficacy of our proposed method and reveal its superiority over other SOTA alternatives. Our code is available at https://github.com/EstrellaXyu/Differential-Alignment-for-DAOD.
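To make the PDFA idea concrete, the following is a minimal sketch of discrepancy-weighted instance alignment. It is not the authors' implementation: the abstract does not give the discrepancy measure or the weighting formula, so the L1 distance between class-probability vectors, the normalization to a mean weight of 1, and all function names here are assumptions chosen for illustration.

```python
import math


def prediction_discrepancy(teacher_probs, student_probs):
    """L1 distance between two class-probability vectors.

    Assumption: the paper's actual discrepancy measure may differ.
    """
    return sum(abs(t - s) for t, s in zip(teacher_probs, student_probs))


def discrepancy_weights(teacher_batch, student_batch):
    """One weight per instance, proportional to its teacher-student
    discrepancy and scaled so that the weights average to 1."""
    d = [prediction_discrepancy(t, s)
         for t, s in zip(teacher_batch, student_batch)]
    total = sum(d)
    n = len(d)
    if total == 0:
        # No disagreement anywhere: fall back to uniform alignment.
        return [1.0] * n
    return [n * x / total for x in d]


def weighted_domain_loss(disc_probs, domain_label, weights, eps=1e-7):
    """Weighted binary cross-entropy over instance-level discriminator
    outputs (probability that an instance comes from domain 1)."""
    losses = []
    for p, w in zip(disc_probs, weights):
        p = min(max(p, eps), 1 - eps)  # clamp for numerical safety
        bce = -(domain_label * math.log(p)
                + (1 - domain_label) * math.log(1 - p))
        losses.append(w * bce)
    return sum(losses) / len(losses)


# Two proposals: teacher and student nearly agree on the first one
# and disagree strongly on the second.
teacher = [[0.9, 0.1], [0.5, 0.5]]
student = [[0.8, 0.2], [0.1, 0.9]]
weights = discrepancy_weights(teacher, student)
loss = weighted_domain_loss([0.6, 0.3], 1, weights)
```

In this sketch the second proposal receives the larger weight, so its discriminator loss contributes more to the adversarial objective, which is the behavior the abstract attributes to PDFA.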

Paper Structure

This paper contains 20 sections, 11 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Different from previous methods adopting equal attention feature alignment (upper part), our design manipulates features from the backbone and ROI head with differential attention (lower part). Different colors represent different alignment weights/attentions.
  • Figure 2: Overview of our method. Our approach is built upon the adaptive teacher-student framework. PDFA adjusts weights to different instances with respect to the discrepancy between predictions of the teacher and the student, while UFOA consists of a mask generator and an image-level discriminator. The mask generator produces a foreground-indicating mask to roughly separate the features of the last stage of the FPN into foreground and background parts.
  • Figure 3: The proposals with top 2% prediction discrepancies are marked in red, while the rest are colored in blue.
  • Figure 4: Visualization of pseudo labels generated by the teacher model. Although misclassification, mislocalization, and false detection errors exist, the union of these inaccurate bounding boxes can still largely indicate foreground areas.
  • Figure 5: Feature distribution visualizations using PCA. Different colors represent different domains.
  • ...and 2 more figures
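
The captions of Figures 2 and 4 describe a foreground-indicating mask built from the union of the teacher's pseudo-label boxes. The sketch below shows one simple way such a mask could be rasterized onto a feature grid. It is an illustrative assumption, not the paper's mask generator: the function name, the `stride` parameter, and the rule that a cell counts as foreground if any box touches it are all hypothetical, and the uncertainty-based balancing of UFOA is not modeled.

```python
import math


def foreground_mask(boxes, height, width, stride=1):
    """Binary mask on a (height x width) feature grid.

    A cell is 1 if any box (x1, y1, x2, y2), given in image pixels,
    overlaps it after dividing coordinates by `stride`.
    """
    mask = [[0] * width for _ in range(height)]
    for x1, y1, x2, y2 in boxes:
        # Map image coordinates to feature-grid cell indices,
        # clipped to the grid bounds.
        c1 = max(0, int(x1 // stride))
        r1 = max(0, int(y1 // stride))
        c2 = min(width, int(math.ceil(x2 / stride)))
        r2 = min(height, int(math.ceil(y2 / stride)))
        for r in range(r1, r2):
            for c in range(c1, c2):
                mask[r][c] = 1
    return mask


# Two overlapping pseudo-label boxes on a 32x32 image, stride-8 features.
mask = foreground_mask([(0, 0, 16, 16), (8, 8, 32, 24)],
                       height=4, width=4, stride=8)
for row in mask:
    print(row)
```

Because the mask is a union, an individual box can be misclassified or loosely localized and the covered area still roughly marks where objects are, which matches the observation in the Figure 4 caption.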