Domain Generalization using Causal Matching

Divyat Mahajan, Shruti Tople, Amit Sharma

TL;DR

This work reframes domain generalization through a causal lens, arguing that class-conditional invariance is insufficient when the distribution of stable, within-class features varies across domains. It introduces an object-centered invariance condition based on a structural causal model and proposes matching-based methods: PerfMatch (when object matches are observed), MatchDG (when objects are not observed), and MDGHybrid, which additionally leverages data augmentations. Empirically, MatchDG and MDGHybrid achieve competitive out-of-domain accuracy on Rotated MNIST, Fashion-MNIST, PACS, and Chest X-ray datasets, and on MNIST and Fashion-MNIST the top-10 matches from MatchDG overlap with ground-truth object matches by more than 50%. Overall, the paper argues that enforcing object-level invariance via matching is a competitive alternative to domain-invariant or class-conditional objectives in practical domain generalization tasks.

Abstract

In the domain generalization literature, a common objective is to learn representations independent of the domain after conditioning on the class label. We show that this objective is not sufficient: there exist counter-examples where a model fails to generalize to unseen domains even after satisfying class-conditional domain invariance. We formalize this observation through a structural causal model and show the importance of modeling within-class variations for generalization. Specifically, classes contain objects that characterize specific causal features, and domains can be interpreted as interventions on these objects that change non-causal features. We highlight an alternative condition: inputs across domains should have the same representation if they are derived from the same object. Based on this objective, we propose matching-based algorithms when base objects are observed (e.g., through data augmentation) and approximate the objective when objects are not observed (MatchDG). Our simple matching-based algorithms are competitive with prior work on out-of-domain accuracy for Rotated MNIST, Fashion-MNIST, PACS, and Chest-Xray datasets. Our method MatchDG also recovers ground-truth object matches: on MNIST and Fashion-MNIST, top-10 matches from MatchDG have over 50% overlap with ground-truth matches.
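
The matching condition in the abstract (same object, same representation) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the squared L2 distance, and the weight `lam` are assumptions chosen for illustration, and the matched pairs are taken as given, as in the setting where objects are observed.

```python
import numpy as np

def matching_penalty(features, match_pairs):
    """Mean squared L2 distance between representations of matched inputs.

    features:    array of shape (n_inputs, dim), one representation per input.
    match_pairs: list of (i, j) index pairs assumed to come from the same
                 object observed in two different domains (hypothetical input).
    """
    if len(match_pairs) == 0:
        return 0.0
    idx_a, idx_b = np.asarray(match_pairs).T
    diffs = features[idx_a] - features[idx_b]
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

def cross_entropy(logits, labels):
    """Average cross-entropy of integer labels under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(labels)), labels]))

def matched_objective(logits, labels, features, match_pairs, lam=1.0):
    """Classification loss plus a weighted matching penalty (illustrative)."""
    return cross_entropy(logits, labels) + lam * matching_penalty(features, match_pairs)

# Two objects, each seen in two domains: inputs (0, 1) and inputs (2, 3).
features = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 2.0], [0.0, 1.0]])
pairs = [(0, 1), (2, 3)]
print(matching_penalty(features, pairs))
```

The first pair already shares a representation and contributes nothing; only the second pair is penalized. When objects are not observed, the pairs themselves would have to be inferred, which is the case the abstract attributes to MatchDG.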

Paper Structure

This paper contains 56 sections, 8 theorems, 31 equations, 5 figures, 18 tables, 1 algorithm.

Key Result

Proposition 1

Under the domain generalization setup as above, if $P(X_c|Y)$ remains the same across domains where $x_c$ is the stable feature, then the class-conditional domain-invariant objective for learning representations yields a generalizable classifier such that the learnt representation $\Phi(\mathbf{x})$

Figures (5)

  • Figure 1: Two datasets showing the limitations of the class-conditional domain-invariance objective. a) The CDM predictor is domain-invariant given the class label but does not generalize to the target domain; b) Colors denote the two ground-truth class labels. For class prediction, the linear feature exhibits varying levels of noise across domains. The stable slab feature also has noise, but it is invariant across domains.
  • Figure 2: Structural causal models for the data-generating process. Observed variables are shaded; dashed arrows denote correlated nodes. Object may not be observed.
  • Figure 3: MatchDG regularization penalty is not trivially minimized even as the training error goes to zero.
  • Figure 4: Causal graphs with the node $C$ as a chain, fork, or collider. By the d-separation criterion, $A$ and $B$ are conditionally independent given $C$ in (a) and (b). In (c), however, $A$ and $B$ are independent but become conditionally dependent given $C$.
  • Figure 5: The t-SNE plots for visualizing features learnt in MatchDG Phase 1. (a)-(c) are for Photo as the target domain and (b)-(d) are for Sketch.
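
The collider case described in the Figure 4 caption can be checked numerically. The following is a minimal simulation sketch, not taken from the paper: the linear-Gaussian model, the noise scale, and the conditioning window around $C = 0$ are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Collider structure A -> C <- B: A and B are drawn independently.
a = rng.normal(size=n)
b = rng.normal(size=n)
c = a + b + 0.1 * rng.normal(size=n)

# Marginally, A and B should be (close to) uncorrelated.
corr_marginal = np.corrcoef(a, b)[0, 1]

# Conditioning on C (here: keeping only samples with C near 0)
# induces a strong negative dependence, since A is then roughly -B.
mask = np.abs(c) < 0.1
corr_given_c = np.corrcoef(a[mask], b[mask])[0, 1]

print(f"corr(A, B)          = {corr_marginal:.3f}")
print(f"corr(A, B | C ~ 0)  = {corr_given_c:.3f}")
```

Restricting to a narrow band of $C$ is only a crude stand-in for conditioning, but it is enough to show the dependence that appears once the collider is observed.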

Theorems & Definitions (13)

  • Proposition 1
  • Proposition 2
  • Theorem 1
  • Proposition 3
  • Definition 1
  • Proposition 3
  • proof
  • Proposition 3
  • proof
  • Theorem 1
  • ...and 3 more