Rethinking Domain Generalization: Discriminability and Generalizability

Shaocong Long; Qianyu Zhou; Chenhao Ying; Lizhuang Ma; Yuan Luo

TL;DR

This paper addresses domain generalization by tackling the tension between discriminability and generalizability, noting that traditional DG methods often degrade discriminability via spurious correlations. It proposes DMDA, a framework composed of Selective Channel Pruning (SCP) to remove unstable channels and Micro-level Distribution Alignment (MDA) to align semantics at a finer granularity across domains using latent semantics from domain-specific experts. The optimization couples a classification objective with a semantic-aware invariance term implemented as a minimax game against a distribution approximator, promoting micro-level domain alignment while preserving informative features. Empirical results across five benchmarks show DMDA achieving competitive or superior performance, with pronounced gains in hard transfer scenarios, and ablations confirm the complementary roles of SCP and MDA in enhancing generalization. The approach offers a principled path toward DG by emphasizing stable factor selection and micro-level semantics-aware alignment, with potential impact on robust cross-domain learning in vision tasks.

Abstract

Domain generalization (DG) endeavors to develop robust models that possess strong generalizability while preserving excellent discriminability. Nonetheless, pivotal DG techniques tend to improve the feature generalizability by learning domain-invariant representations, inadvertently overlooking the feature discriminability. On the one hand, the simultaneous attainment of generalizability and discriminability of features presents a complex challenge, often entailing inherent contradictions. This challenge becomes particularly pronounced when domain-invariant features manifest reduced discriminability owing to the inclusion of unstable factors, i.e., spurious correlations. On the other hand, prevailing domain-invariant methods can be categorized as category-level alignment, susceptible to discarding indispensable features possessing substantial generalizability and narrowing intra-class variations. To surmount these obstacles, we rethink DG from a new perspective that concurrently imbues features with formidable discriminability and robust generalizability, and present a novel framework, namely, Discriminative Microscopic Distribution Alignment (DMDA). DMDA incorporates two core components: Selective Channel Pruning (SCP) and Micro-level Distribution Alignment (MDA). Concretely, SCP attempts to curtail redundancy within neural networks, prioritizing stable attributes conducive to accurate classification. This approach alleviates the adverse effect of spurious domain invariance and amplifies the feature discriminability. Besides, MDA accentuates micro-level alignment within each class, going beyond mere category-level alignment. Extensive experiments on four benchmark datasets corroborate that DMDA achieves comparable results to state-of-the-art methods in DG, underscoring the efficacy of our method.
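To make the idea of channel-wise pruning concrete, the following is a minimal sketch of applying a binary channel mask to a batch of feature vectors. It is not the authors' SCP module: the per-channel `scores`, the `keep_ratio` parameter, and the top-k selection rule are all assumptions introduced here for illustration, since this summary does not state how SCP derives its masks.

```python
from typing import List


def channel_mask(scores: List[float], keep_ratio: float) -> List[int]:
    """Build a 0/1 mask that keeps the highest-scoring channels.

    `scores` is a hypothetical per-channel stability score; how such a
    score would be learned is outside the scope of this sketch.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, int(keep_ratio * len(scores)))
    # Rank channel indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    kept = set(ranked[:n_keep])
    return [1 if c in kept else 0 for c in range(len(scores))]


def prune(features: List[List[float]], mask: List[int]) -> List[List[float]]:
    """Zero out masked channels; each row is one sample's feature vector."""
    return [[value * m for value, m in zip(row, mask)] for row in features]


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.7, 0.2]  # assumed stability scores, 4 channels
    mask = channel_mask(scores, keep_ratio=0.5)
    batch = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
    print(mask)
    print(prune(batch, mask))
```

In this toy setup the two lowest-scoring channels are zeroed for every sample, so any later alignment step would operate only on the retained channels.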

Paper Structure (17 sections, 2 theorems, 17 equations, 10 figures, 10 tables, 1 algorithm)

Introduction
Related Work
Methodology
Preliminaries
Selective Channel Pruning
Micro-level Distribution Alignment
Optimization Objective
Experiments
Datasets
Implementation Details
Comparison Results with State-of-the-Art Techniques
Ablation Study
Empirical Analysis
Conclusion
Proofs
...and 2 more sections

Key Result

Proposition 1

Let $\Phi^{\prime} = f(X)$ for a fixed representation function $f$ and $S^{\prime} = \{E_i(\Phi)\}_{i = 1}^M$ for fixed semantics experts $\{E_i\}_{i = 1}^M$. Then the optimal probability $D^{\ast}$ for the inner maximization in the minimax objective (Eq. minmax) is

Figures (10)

Figure 1: Almost all DG methods tend to improve the feature generalizability by learning domain-invariant representations, and inadvertently overlook the feature discriminability, leading to spurious domain invariance. (a) To boost the feature discriminability, we propose Selective Channel Pruning (SCP) to filter out unstable factors, i.e., spurious correlations, thereby mitigating the adverse effects of such correlations. (b) Besides, we introduce Micro-level Distribution Alignment (MDA) to prevent the risk of discarding indispensable generalizable features in previous category-level distribution alignment. MDA could accommodate sufficient generalizable features while simultaneously enhancing within-class variations, thereby promoting feature generalizability.
Figure 2: Comparisons of classification error rates on the feature representations. The error rate measures the discriminability of acquired features.
Figure 3: The channel activation frequency in the penultimate layer of ResNet-18 trained via ERM, with 'Art', 'Cartoon', 'Photo', and 'Sketch' as the target domain, respectively. Channels are arranged in descending order based on the activation frequency in source domains. Channels experiencing infrequent activation in source domains but displaying frequent activation in the target domain can be characterized as unstable factors demonstrating spurious correlations.
Figure 4: Framework of our proposed Discriminative Microscopic Distribution Alignment (DMDA). The features generated by the feature extractor are initially transmitted to the Selective Channel Pruning (SCP) module to create channel-wise masks for the purpose of eliminating unstable channels. Following this, the features undergo pruning based on the channel-wise masks. Subsequently, Micro-level Distribution Alignment (MDA) is employed to execute micro-level distribution alignment rooted in the latent semantics of the pruned features, which are generated by additional semantics experts.
Figure 5: Visualizations with t-SNE embeddings (van der Maaten and Hinton, 2008) depicting distinct classes' representations produced by (a) ERM, (b) SCP, (c) MDA, and (d) the combination DMDA, respectively, with 'Photo' as the target domain. Zoom in for details.
...and 5 more figures
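
The diagnostic described in the Figure 3 caption can be sketched as follows: compute how often each channel is active in source and target features, then flag channels that are rarely active on the source domains but frequently active on the target. The activation threshold and the `low`/`high` cutoffs are assumptions chosen for illustration, not values from the paper. Because the target domain is unseen during DG training, a comparison like this is only usable as post-hoc analysis.

```python
from typing import List


def activation_frequency(features: List[List[float]],
                         threshold: float = 0.0) -> List[float]:
    """Fraction of samples in which each channel exceeds `threshold`."""
    n_samples = len(features)
    n_channels = len(features[0])
    return [
        sum(1 for row in features if row[c] > threshold) / n_samples
        for c in range(n_channels)
    ]


def unstable_channels(source: List[List[float]],
                      target: List[List[float]],
                      low: float = 0.25,
                      high: float = 0.75) -> List[int]:
    """Indices of channels rarely active on source but often on target.

    `low` and `high` are hypothetical cutoffs for "infrequent" and
    "frequent" activation.
    """
    freq_src = activation_frequency(source)
    freq_tgt = activation_frequency(target)
    return [
        c for c, (s, t) in enumerate(zip(freq_src, freq_tgt))
        if s <= low and t >= high
    ]


if __name__ == "__main__":
    # Toy post-ReLU features: 4 samples, 3 channels each.
    source = [[1.0, 0.0, 2.0], [0.5, 0.0, 0.0], [3.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
    target = [[1.0, 2.0, 0.0], [0.0, 1.0, 0.0], [2.0, 3.0, 1.0], [1.0, 0.5, 0.0]]
    print(activation_frequency(source))
    print(unstable_channels(source, target))
```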

Theorems & Definitions (4)

Proposition 1
Theorem 1
Proof A.1
Proof A.2
