G-NAS: Generalizable Neural Architecture Search for Single Domain Generalization Object Detection
Fan Wu, Jinling Gao, Lanqing Hong, Xinbing Wang, Chenghu Zhou, Nanyang Ye
TL;DR
This work tackles Single Domain Generalization Object Detection (S-DGOD) by introducing G-NAS, a differentiable Neural Architecture Search (NAS) framework guided by a Generalizable loss ($L_g$, also called G-loss) that prevents over-fitting to easy-to-learn, non-causal features. By optimizing both network weights and architectural choices under an out-of-distribution (OoD)-aware objective, G-NAS identifies prediction-head architectures that generalize across unseen weather and lighting domains without access to target-domain data. Empirical results on urban-scene datasets show that G-NAS achieves state-of-the-art generalization across multiple target domains, with notable gains in the challenging Night and Fog conditions and robust per-class AP. Ablation studies confirm that both NAS and $L_g$ contribute to improved OoD generalization, and visualizations suggest that more domain-invariant, causally relevant representations emerge when $L_g$ is used.
Abstract
In this paper, we focus on a realistic yet challenging task, Single Domain Generalization Object Detection (S-DGOD), where object detectors are trained on data from only one source domain but must generalize to multiple distinct target domains. In S-DGOD, both high fitting capacity and generalization ability are needed due to the task's complexity. Differentiable Neural Architecture Search (NAS) is known for its high capacity for fitting complex data, and we propose to leverage it to solve S-DGOD. However, differentiable NAS may suffer from severe over-fitting due to the feature imbalance phenomenon, where parameters optimized by gradient descent are biased toward easy-to-learn features, which are usually non-causal and spuriously correlated with ground-truth labels, such as background features in object detection data. Consequently, performance degrades seriously, especially when generalizing to unseen target domains that have large domain gaps from the source domain. To address this issue, we propose the Generalizable loss (G-loss), an OoD-aware objective that prevents NAS from over-fitting by using gradient descent to optimize parameters not only on a subset of easy-to-learn features but also on the remaining predictive features for generalization. We name the overall framework G-NAS. Experimental results on the S-DGOD urban-scene datasets demonstrate that the proposed G-NAS achieves SOTA performance compared to baseline methods. Code is available at https://github.com/wufan-cse/G-NAS.
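To make the idea of differentiable NAS concrete, the sketch below shows the continuous relaxation commonly used in DARTS-style search: each searchable edge outputs a softmax-weighted mixture of candidate operations, so the architecture parameters can be optimized by gradient descent alongside the network weights. This is an illustrative toy under stated assumptions, not the G-NAS implementation: the candidate operations, the names `CANDIDATE_OPS`, `mixed_op`, and `derive_op`, and the scalar inputs are all hypothetical, and the G-loss itself is not modeled here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations for a single searchable edge.
# A real search space would hold layers such as convolutions instead.
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}


def mixed_op(x, alphas):
    """Continuous relaxation: softmax-weighted sum of all candidate ops.

    `alphas` holds one architecture parameter per candidate operation.
    """
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))


def derive_op(alphas):
    """Discretize after search: keep the op with the largest alpha."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]


if __name__ == "__main__":
    alphas = [0.1, 2.0, -1.0]  # assumed values, for illustration only
    print("mixed output:", mixed_op(1.0, alphas))
    print("selected op:", derive_op(alphas))
```

Because the mixture is differentiable in `alphas`, any training objective applied to the mixed output also shapes the architecture choice, which is why the choice of loss matters for whether the searched architecture over-fits to the source domain.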
