Unknown Domain Inconsistency Minimization for Domain Generalization

Seungjae Shin, HeeSun Bae, Byeonghu Na, Yoon-Yeong Kim, Il-Chul Moon

TL;DR

UDIM reduces the loss-landscape inconsistency between the source domain and unknown domains by aligning the loss landscape of the source domain with that of perturbed domains, with the aim of generalizing to unknown domains from the resulting flat minima.

Abstract

The objective of domain generalization (DG) is to enhance the transferability of a model learned from a source domain to unobserved domains. To prevent overfitting to a specific domain, Sharpness-Aware Minimization (SAM) reduces the loss sharpness of the source domain. Although SAM variants have delivered significant improvements in DG, we highlight that there is still room to improve generalization to unknown domains by exploring the data space. This paper introduces an objective rooted in both parameter-perturbed and data-perturbed regions for domain generalization, coined Unknown Domain Inconsistency Minimization (UDIM). UDIM reduces the loss-landscape inconsistency between the source domain and unknown domains. As unknown domains are inaccessible, these domains are empirically crafted by perturbing instances from the source domain dataset. In particular, by aligning the loss landscape acquired in the source domain to the loss landscape of the perturbed domains, we expect to achieve generalization grounded on these flat minima for the unknown domains. Theoretically, we validate that merging SAM optimization with the UDIM objective establishes an upper bound for the true objective of the DG task. Empirically, UDIM consistently outperforms SAM variants across multiple DG benchmark datasets. Notably, UDIM shows statistically significant improvements in scenarios with more restrictive domain information, underscoring UDIM's generalization capability in unseen domains. Our code is available at https://github.com/SJShin-AI/UDIM.
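The two perturbations named in the abstract, one in parameter space and one in data space, can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the authors' implementation (that is in the linked repository). It assumes a linear logistic model, a first-order SAM-style ascent step, and a simple loss gap between perturbed and source inputs as a stand-in for the inconsistency term. The function names and the radius `rho` are hypothetical, and `rho_x` is used here only as an illustrative data-perturbation radius.

```python
import numpy as np

def logistic_loss(w, X, y):
    """Mean binary cross-entropy of a linear model; labels y are in {0, 1}."""
    z = X @ w
    # softplus(z) - y*z is the cross-entropy written in a numerically stable form
    return float(np.mean(np.logaddexp(0.0, z) - y * z))

def grad_w(w, X, y):
    """Gradient of the mean loss with respect to the parameters w."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p - y) / len(y)

def grad_x(w, X, y):
    """Per-instance gradient of each instance's loss with respect to its input."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return (p - y)[:, None] * w[None, :]

def unit(v, axis=None, eps=1e-12):
    """Scale v to (approximately) unit L2 norm along the given axis."""
    norm = np.linalg.norm(v, axis=axis, keepdims=axis is not None)
    return v / (norm + eps)

def udim_style_terms(w, X, y, rho=0.05, rho_x=0.1):
    """Return (SAM-style source loss, loss gap on perturbed inputs).

    Assumption: the 'perturbed domain' is built by moving every source
    instance a distance rho_x along its own loss-ascent direction.
    """
    # Parameter-space step: first-order approximation of the worst case
    # inside a ball of radius rho around w (the SAM-style perturbation).
    w_hat = w + rho * unit(grad_w(w, X, y))
    # Data-space step: craft a surrogate unknown domain from source instances.
    X_pert = X + rho_x * unit(grad_x(w_hat, X, y), axis=1)
    sam_loss = logistic_loss(w_hat, X, y)
    # Stand-in inconsistency: how much worse the perturbed domain is than
    # the source domain at the same perturbed parameters.
    inconsistency = logistic_loss(w_hat, X_pert, y) - sam_loss
    return sam_loss, inconsistency

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))
y = (X[:, 0] > 0).astype(float)
w = rng.normal(size=5)

sam_loss, gap = udim_style_terms(w, X, y)
lam = 1.0  # hypothetical weight on the inconsistency term
print("SAM-style loss:", sam_loss, "gap:", gap, "combined:", sam_loss + lam * gap)
```

In this toy setting, minimizing the combined value penalizes parameters whose loss rises sharply under either a small weight perturbation or a small input perturbation, which is the intuition the abstract describes.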

Paper Structure (47 sections, 4 theorems, 24 equations, 7 figures, 8 tables, 1 algorithm)

Key Result

Theorem 3.1

(Rangwani et al., 2022) For $\theta \in \Theta$ and an arbitrary domain $\mathscr{D}_{e} \in \mathscr{D}$, with probability at least $1-\delta$ over a realized dataset $D_{s}$ from $\mathscr{D}_{s}$ with $|D_{s}|=n$, the following holds under some technical conditions on $\mathcal{L}_{\mathscr{D}_{e}}(\theta)$: ...

Figures (7)

  • Figure 1: Illustration of our model, UDIM, based on parameter space (a) and data space (b). (a) We define flatness within a perturbed region by minimizing the inconsistency loss relative to the unknown domains, around the flat region derived from the source domain. (b) Furthermore, by reducing the domain-wise inconsistency within the input-perturbed regions, where $\rho_{x}$ denotes the perturbation length, our method can also be interpreted as a data-space perspective on SAM.
  • Figure 2: Inconsistency score of each method on the PACS training dataset (X-axis: training iteration). The Y-axis is on a log scale.
  • Figure 3: (a) Sensitivity analyses of UDIM. (b) Test accuracy of UDIM and sharpness-based approaches over training iterations. Shaded regions represent standard deviation.
  • Figure 4: Ablation study of UDIM
  • Figure 5: Sharpness plots for models trained using various methods: the upper plot shows sharpness on the perturbed parameter space, while the lower plot displays sharpness on the perturbed data space. The colormap of each row is normalized into the same scale for fair comparison.
  • ...and 2 more figures

Theorems & Definitions (6)

  • Theorem 3.1
  • Theorem 3.2
  • Theorem B.1
  • Definition B.2
  • Theorem B.5
  • proof