A Pairwise DomMix Attentive Adversarial Network for Unsupervised Domain Adaptive Object Detection

Jie Shao, Jiacheng Wu, Wenzhong Shen, Cheng Yang

TL;DR

The paper tackles unsupervised domain adaptive object detection under large domain shifts by introducing a pairwise DomMix attentive adversarial network. It constructs an intermediate domain through deep multi-scale image-level feature mixup (DomMix) to bridge the source and target distributions, then applies a pairwise attentive module with Residual SimAM and ECAP to refine image- and instance-level features, followed by an Adaptive Pyramid Classifier for adversarial domain discrimination. The overall objective combines a detection loss, a consistency regularizer, and a domain loss, and is optimized adversarially to align cross-domain features. Experiments on VOC→Clipart, VOC→Watercolor, and VOC→Comic show competitive performance, and ablations show that both DomMix and the attention modules contribute to cross-domain detection without target labels.

Abstract

Unsupervised Domain Adaptive Object Detection (DAOD) adapts a model trained on a source domain to an unlabeled target domain for object detection. Existing unsupervised DAOD methods usually align features in one direction, from the target to the source. Such unidirectional domain transfer omits information about the target samples and results in suboptimal adaptation when the domain shift is large. We therefore propose a pairwise attentive adversarial network with a Domain Mixup (DomMix) module to mitigate these challenges. Specifically, a deep-level mixup is employed to construct an intermediate domain that allows features from both domains to share their differences. A pairwise attentive adversarial network then applies attentive encoding to both image-level and instance-level features at different scales and optimizes domain alignment by adversarial learning. This allows the network to focus on regions with disparate contextual information and learn their similarities across domains. Extensive experiments are conducted on several benchmark datasets, demonstrating the superiority of our proposed method.
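
The abstract does not spell out the mixup formula, so the following is only a minimal sketch of how an intermediate domain could be built from multi-scale features. It assumes the standard mixup form, a convex combination with mixing ratio $\lambda$; the function name `dommix`, its arguments, and the default ratio are hypothetical and not taken from the paper. Which layers are mixed and how $\lambda$ is chosen are not stated here.

```python
def dommix(source_feats, target_feats, lam=0.5):
    """Blend per-scale source and target features into an intermediate domain.

    ASSUMPTION: standard mixup (convex combination); the paper's exact
    formulation is not given in the abstract.

    source_feats, target_feats: sequences with one entry per scale. Each entry
    may be anything that supports scalar multiplication and addition
    (a float, a NumPy array, a PyTorch tensor).
    lam: mixing ratio in [0, 1]; lam=1 returns the source features unchanged.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(source_feats) != len(target_feats):
        raise ValueError("source and target need the same number of scales")
    # Mix each scale independently: lam * source + (1 - lam) * target.
    return [lam * fs + (1.0 - lam) * ft
            for fs, ft in zip(source_feats, target_feats)]


# Toy usage: scalars stand in for feature maps at two scales.
mixed = dommix([1.0, 2.0], [3.0, 6.0], lam=0.25)
print(mixed)  # → [2.5, 5.0]
```

With real feature maps, the same call would take lists of equally shaped arrays or tensors, one pair per pyramid level.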

Paper Structure (13 sections, 5 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: The overview of our method.
  • Figure 2: SimAM (top) and RSA (bottom).
  • Figure 3: Watercolor2k samples before (left) and after (right) domain adaptation.
  • Figure 4: Variation of mAP corresponding to different mixing ratios $\lambda$ in DomMix as the number of training rounds increases (left), and variation of mAP corresponding to different weighting factors $\alpha$ in PAM (right).