Instance-Free Domain Adaptive Object Detection

Hengfu Yu, Jinhong Deng, Lixin Duan, Wen Li

TL;DR

This work tackles Instance-Free Domain Adaptive Object Detection, where the target-domain training data contains no foreground objects. It introduces the Relational and Structural Consistency Network (RSCN), which uses background prototypes and three losses (Background Prototype Alignment, Relative Space Harmonization, and Source Structure Preservation) to bridge the domain gap without target foregrounds. Across three new benchmarks (IF-CARLA, IF-CCT, IF-LUNA16), RSCN significantly improves cross-domain detection over strong baselines, and ablations confirm the value of each component. The approach makes it practical to deploy detectors where collecting foreground examples in the target domain is costly or infeasible, with applications in driving, wildlife monitoring, and medical imaging.

Abstract

While Domain Adaptive Object Detection (DAOD) has made significant strides, most methods rely on unlabeled target data that is assumed to contain sufficient foreground instances. However, in many practical scenarios (e.g., wildlife monitoring, lesion detection), collecting target-domain data with objects of interest is prohibitively costly, whereas background-only data is abundant. This constraint poses a significant technical challenge: domain alignment must be achieved without target instances, so adaptation has to rely solely on target background information. We formulate this challenge as the novel problem of Instance-Free Domain Adaptive Object Detection. To tackle it, we propose the Relational and Structural Consistency Network (RSCN), which pioneers an alignment strategy based on background feature prototypes while simultaneously encouraging consistency in the relationship between the source foreground features and the background features within each domain, enabling robust adaptation even without target instances. To facilitate research, we further curate three specialized benchmarks covering simulated autonomous-driving detection, wildlife detection, and lung nodule detection. Extensive experiments show that RSCN significantly outperforms existing DAOD methods across all three benchmarks in the instance-free scenario. The code and benchmarks will be released soon.

Paper Structure (18 sections, 7 equations, 12 figures, 7 tables)

Figures (12)

  • Figure 1: Illustration of Instance-Free DAOD. The orange timeline indicates that target-domain instances appear only after a long waiting period and require costly manual screening. Traditional DAOD (top) relies on target images containing foreground instances, whereas Instance-Free DAOD (bottom) enables transfer using only background-only target images.
  • Figure 2: An overview of the proposed RSCN. For every batch, source-domain images with labeled foreground objects and target-domain images without foreground objects are fed to the detector. The background prototypes are aligned by the BPA objective. RSH preserves the relative geometric relationship of the source and target background prototypes to the shared foreground prototype anchors. SSP uses a frozen reference detector to maintain the source-domain structure and avoid feature collapse.
  • Figure 3: An overview of the three proposed constraints. (a) Background Prototype Alignment (BPA) minimizes the distance between the source and target background features. (b) Relative Space Harmonization (RSH) aligns the relationship between the source-domain foreground features and the background features from the two domains. (c) Source Structure Preservation (SSP) maintains the relative feature structure in the source domain.
  • Figure 4: Examples of the three benchmarks. The IF-CARLA benchmark illustrates a domain shift from daytime to nighttime driving scenes. The IF-CCT benchmark captures a modality shift between visible-light imagery and infrared illumination. The IF-LUNA16 benchmark represents a cross-device transfer scenario, where differences in image layout, noise characteristics, and CT reconstruction protocols create inter-device domain discrepancies. Objects are marked with green bounding boxes.
  • Figure 5: Performance comparison between RSCN and two baseline methods on IF-CARLA. Our method achieves better performance than (a) the feature-alignment-based DAF and (b) the self-training-based AT baselines.
  • ...and 7 more figures
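
The three constraints named in the Figure 3 caption can be illustrated with a minimal NumPy sketch. It is not the paper's formulation: the equations are not reproduced on this page, so the mean-vector prototypes, cosine similarities, squared-error penalties, and all function names below are assumptions chosen only to make the idea concrete.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero for all-zero vectors


def prototype(features):
    """Assumed prototype: the mean of an (N, D) array of feature vectors."""
    return features.mean(axis=0)


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + EPS))


def bpa_loss(src_bg, tgt_bg):
    """BPA-style term: squared distance between the source and target
    background prototypes (assumed distance measure)."""
    diff = prototype(src_bg) - prototype(tgt_bg)
    return float(diff @ diff)


def rsh_loss(src_fg_protos, src_bg, tgt_bg):
    """RSH-style term: each domain's background prototype should relate to
    the shared source foreground anchors in the same way. Here the
    'relationship' is assumed to be cosine similarity."""
    p_src, p_tgt = prototype(src_bg), prototype(tgt_bg)
    rel_src = np.array([cosine(f, p_src) for f in src_fg_protos])
    rel_tgt = np.array([cosine(f, p_tgt) for f in src_fg_protos])
    return float(np.mean((rel_src - rel_tgt) ** 2))


def ssp_loss(src_feats, ref_feats):
    """SSP-style term: the pairwise-similarity structure of source features
    from the adapting detector should match that of a frozen reference
    detector (both inputs are (N, D) arrays for the same N samples)."""
    def similarity_matrix(x):
        x = x / (np.linalg.norm(x, axis=1, keepdims=True) + EPS)
        return x @ x.T
    return float(np.mean((similarity_matrix(src_feats)
                          - similarity_matrix(ref_feats)) ** 2))
```

In this sketch each term is zero when the quantity it constrains already matches, for example `bpa_loss(x, x)` for any feature array `x`. How the terms are weighted and combined with the detection loss is left open here, since this page does not state it.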