Domain Adversarial Active Learning for Domain Generalization Classification

Jianting Chen, Ling Ding, Yunxiao Yang, Zaiyuan Di, Yang Xiang

TL;DR

DAAL addresses domain generalization under labeling constraints by combining domain-adversarial sample selection with feature-subset optimization. It defines a sampling score that favors cross-domain challenging samples via centroid-based distances and a loss that emphasizes weakly discriminative domain-specific features, jointly maximizing within-domain inter-class separation and cross-domain invariances. Across benchmarks (PACS, VLCS, Digits, rotated MNIST), DAAL achieves strong generalization with fewer labels and often outperforms both domain-generalization baselines and traditional active-learning approaches, with ablations confirming the contribution of each component. The approach offers practical gains in settings with limited annotation budgets and diverse domain shifts, by reducing labeling costs while maintaining robust cross-domain performance.

Abstract

Domain generalization models aim to learn cross-domain knowledge from source domain data in order to improve performance on unknown target domains. Recent research has demonstrated that diverse and rich source domain samples can enhance domain generalization capability. This paper argues that samples differ in how much they contribute to the model's generalization ability: even a small dataset, if it is of high quality, can attain a certain level of generalization ability. Motivated by this, we propose a domain-adversarial active learning (DAAL) algorithm for classification tasks in domain generalization. First, we show that the objective of these tasks is to maximize the inter-class distance within the same domain and minimize the intra-class distance across different domains. To achieve this objective, we design a domain-adversarial selection method that prioritizes challenging samples. Second, we posit that even in a converged model, there are subsets of features that lack discriminatory power within each domain. We attempt to identify these feature subsets and optimize them with a constraint loss. We validate and analyze our DAAL algorithm on multiple domain generalization datasets, comparing it with various domain generalization algorithms and active learning algorithms. Our results demonstrate that the DAAL algorithm can achieve strong generalization ability with fewer data resources, thereby reducing data annotation costs in domain generalization tasks.
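
The two distances named in the abstract can be made concrete with a small sketch. This is an illustration only, not the paper's formulation: it assumes Euclidean distance between per-(domain, class) centroids, and the function names, domain names, and toy 2-D features are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import dist
from statistics import fmean


def centroids(features, labels, domains):
    """Mean feature vector for every (domain, class) pair."""
    groups = defaultdict(list)
    for f, y, d in zip(features, labels, domains):
        groups[(d, y)].append(f)
    return {
        key: [fmean(col) for col in zip(*vecs)]
        for key, vecs in groups.items()
    }


def inter_class_within_domain(cents):
    """Average centroid distance between different classes of the same domain.
    Assumption: the quantity to be maximized is measured between centroids."""
    pairs = [
        dist(cents[a], cents[b])
        for a, b in combinations(cents, 2)
        if a[0] == b[0] and a[1] != b[1]
    ]
    return fmean(pairs) if pairs else 0.0


def intra_class_across_domains(cents):
    """Average centroid distance between the same class in different domains.
    Assumption: the quantity to be minimized is measured between centroids."""
    pairs = [
        dist(cents[a], cents[b])
        for a, b in combinations(cents, 2)
        if a[0] != b[0] and a[1] == b[1]
    ]
    return fmean(pairs) if pairs else 0.0


# Toy data: two hypothetical domains, two classes, 2-D features.
features = [(0, 0), (0, 2), (4, 0), (4, 2), (0, 3), (0, 5), (4, 3), (4, 5)]
labels = [0, 0, 1, 1, 0, 0, 1, 1]
domains = ["photo"] * 4 + ["sketch"] * 4

cents = centroids(features, labels, domains)
inter = inter_class_within_domain(cents)   # larger is better
intra = intra_class_across_domains(cents)  # smaller is better
print(f"inter-class (within domain): {inter:.2f}")
print(f"intra-class (across domains): {intra:.2f}")
```

On this toy data the classes are separated by 4 units within each domain, while the same class is offset by 3 units between the two domains. A model that generalizes well across domains would push the first number up and the second down.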

Paper Structure

This paper contains 15 sections, 11 equations, 10 figures, 6 tables.

Figures (10)

  • Figure 1: Domain generalization results on the PACS dataset for ERM models trained on randomly selected training subsets of four sizes: 25%, 50%, 75%, and 100%.
  • Figure 2: The iterative framework of the domain adversarial active learning algorithm.
  • Figure 3: The illustration of intra-class distance within the same domain and intra-class distance across different domains.
  • Figure 4: The illustration of domain adversarial samples.
  • Figure 5: The potential feature distributions after training with samples from two domains. In the left graph, the square domain establishes a decision boundary based on the feature $f^1$, while the triangle domain establishes a decision boundary based on the other feature, $f^2$. In the right graph, both domains have the flexibility to rely on either feature for classification.
  • ...and 5 more figures