Domain Adversarial Active Learning for Domain Generalization Classification
Jianting Chen, Ling Ding, Yunxiao Yang, Zaiyuan Di, Yang Xiang
TL;DR
DAAL addresses domain generalization under labeling constraints by combining domain-adversarial sample selection with feature-subset optimization. It defines a sampling score that uses centroid-based distances to favor samples that are challenging across domains, and a loss that emphasizes weakly discriminative domain-specific features; together these maximize within-domain inter-class separation and cross-domain invariance. Across benchmarks (PACS, VLCS, Digits, rotated MNIST), DAAL achieves strong generalization with fewer labels and often outperforms both domain-generalization baselines and traditional active-learning approaches, with ablations confirming the contribution of each component. The approach offers practical gains in settings with limited annotation budgets and diverse domain shifts: it reduces labeling costs while maintaining robust cross-domain performance.
Abstract
Domain generalization models aim to learn cross-domain knowledge from source domain data in order to improve performance on unknown target domains. Recent research has demonstrated that diverse and rich source domain samples can enhance domain generalization capability. This paper argues that samples differ in how much they contribute to the model's generalization ability, so that even a small, high-quality dataset can attain a certain level of generalization ability. Motivated by this, we propose a domain-adversarial active learning (DAAL) algorithm for classification tasks in domain generalization. First, our analysis indicates that the objective of such tasks is to maximize the inter-class distance within the same domain and to minimize the intra-class distance across different domains. To achieve this objective, we design a domain-adversarial selection method that prioritizes challenging samples. Second, we posit that even in a converged model, there are subsets of features that lack discriminatory power within each domain. We attempt to identify these feature subsets and optimize them with a constraint loss. We validate and analyze our DAAL algorithm on multiple domain generalization datasets, comparing it with various domain generalization algorithms and active learning algorithms. Our results demonstrate that the DAAL algorithm can achieve strong generalization ability with fewer data resources, thereby reducing data annotation costs in domain generalization tasks.
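The two distances named in the objective above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the DAAL sampling score or loss: it assumes each (domain, class) group is summarized by the mean of its feature vectors, uses Euclidean distance between those centroids, and the function name `centroid_distances` and its arguments are hypothetical.

```python
import numpy as np


def centroid_distances(features, labels, domains):
    """Summarize the two quantities in the stated objective (illustrative only).

    Returns a pair of floats:
      inter: mean distance between centroids of different classes in the same domain
      intra: mean distance between centroids of the same class in different domains
    A value is NaN when no centroid pair of that kind exists.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    domains = np.asarray(domains)

    # Assumption: one centroid per (domain, class) group, taken as the feature mean.
    centroids = {}
    for d in np.unique(domains):
        for c in np.unique(labels):
            mask = (domains == d) & (labels == c)
            if mask.any():
                centroids[(d.item(), c.item())] = features[mask].mean(axis=0)

    inter, intra = [], []
    keys = sorted(centroids)
    for i, (d1, c1) in enumerate(keys):
        for d2, c2 in keys[i + 1:]:
            dist = float(np.linalg.norm(centroids[(d1, c1)] - centroids[(d2, c2)]))
            if d1 == d2 and c1 != c2:
                inter.append(dist)  # same domain, different class: should be large
            elif d1 != d2 and c1 == c2:
                intra.append(dist)  # same class, different domain: should be small

    mean_or_nan = lambda xs: float(np.mean(xs)) if xs else float("nan")
    return mean_or_nan(inter), mean_or_nan(intra)
```

Under these assumptions, a feature extractor that serves the stated objective well would yield a large first value and a small second value on held-out source data.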
