Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection
Yajing Liu, Shijun Zhou, Xiyao Liu, Chunhui Hao, Baojie Fan, Jiandong Tian
TL;DR
This work tackles single-source domain generalization for object detection by modeling data and feature biases with a Structural Causal Model and building a causal learning framework on top of it. The Unbiased Faster R-CNN (UFR) combines a Global-Local Transformation for data augmentation with a Causal Attention Learning module and a Causal Prototype Learning module, which encourage causal representations at the image level and the object level respectively. Experiments on five scenes show improved generalization, including a 3.9 percentage-point mAP gain on the Night-Clear scene. The approach offers a causal, feature-level mechanism for detecting objects robustly under distribution shift, with potential for more reliable real-world perception systems.
Abstract
Single-source domain generalization (SDG) for object detection is a challenging yet essential task, because the distribution bias of the unseen domain significantly degrades detector performance. Existing methods attempt to extract domain-invariant features, but they neglect that biased data leads the network to learn biased features that are non-causal and poorly generalizable. To this end, we propose an Unbiased Faster R-CNN (UFR) for generalizable feature learning. Specifically, we formulate SDG in object detection from a causal perspective and construct a Structural Causal Model (SCM) to analyze the data bias and feature bias in the task, which are caused by scene confounders and object attribute confounders. Based on the SCM, we design a Global-Local Transformation module for data augmentation, which effectively simulates domain diversity and mitigates the data bias. Additionally, we introduce a Causal Attention Learning module that incorporates a designed attention invariance loss to learn image-level features that are robust to scene confounders. Moreover, we develop a Causal Prototype Learning module with an explicit instance constraint and an implicit prototype constraint, which further alleviates the negative impact of object attribute confounders. Experimental results on five scenes demonstrate the strong generalization ability of our method, with an improvement of 3.9% mAP on the Night-Clear scene.
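The abstract names an attention invariance loss but does not give its form. As a rough illustration of the general idea only, the sketch below penalizes disagreement between the spatial attention maps of a source image and its augmented view. Everything in it is an assumption made for illustration: the function names `spatial_attention` and `attention_invariance_loss`, the channel-mean-plus-softmax attention, and the mean-squared-error penalty are hypothetical and may differ from the loss actually used in UFR.

```python
import math


def spatial_attention(features):
    """Collapse a C x H x W feature map (nested lists) into a flat
    attention distribution over the H*W spatial locations.

    Assumption: attention is the softmax of the channel-wise mean of
    absolute activations. This is an illustrative choice, not the paper's.
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    # Mean absolute activation across channels at each location.
    energy = [
        sum(abs(features[c][i][j]) for c in range(channels)) / channels
        for i in range(height)
        for j in range(width)
    ]
    # Numerically stable softmax over all spatial locations.
    peak = max(energy)
    exps = [math.exp(e - peak) for e in energy]
    total = sum(exps)
    return [x / total for x in exps]


def attention_invariance_loss(feat_source, feat_augmented):
    """Mean squared difference between the attention maps of two views
    (hypothetical stand-in for the paper's attention invariance loss)."""
    a = spatial_attention(feat_source)
    b = spatial_attention(feat_augmented)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy feature maps: 2 channels on a 2 x 2 grid.
source = [[[1.0, 0.0], [0.0, 0.0]], [[2.0, 0.0], [0.0, 0.0]]]
# Same spatial pattern with a uniform offset, loosely mimicking a global
# appearance change such as a brightness shift.
offset = [[[v + 0.5 for v in row] for row in ch] for ch in source]
# Response moved to a different location: the attended region changes.
moved = [[[0.0, 0.0], [0.0, 1.0]], [[0.0, 0.0], [0.0, 2.0]]]

loss_offset = attention_invariance_loss(source, offset)
loss_moved = attention_invariance_loss(source, moved)
print(loss_offset, loss_moved)
```

In this toy setup the uniform offset leaves the attention map essentially unchanged, so the penalty stays near zero, while moving the response to another location yields a clearly positive penalty. That is the behavior one would want from a loss meant to make image-level attention robust to scene confounders.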
