Partial Transportability for Domain Generalization

Kasra Jalaldoust, Alexis Bellot, Elias Bareinboim

TL;DR

This paper introduces new results for bounding the value of a functional of the target distribution, such as the generalization error of a classifier, given data from source domains and assumptions about the data generating mechanisms, encoded in causal diagrams.

Abstract

A fundamental task in AI is providing performance guarantees for predictions made in unseen domains. In practice, there can be substantial uncertainty about the distribution of new data, and corresponding variability in the performance of existing predictors. Building on the theory of partial identification and transportability, this paper introduces new results for bounding the value of a functional of the target distribution, such as the generalization error of a classifier, given data from source domains and assumptions about the data generating mechanisms, encoded in causal diagrams. Our contribution is to provide the first general estimation technique for transportability problems, adapting existing parameterization schemes such as Neural Causal Models to encode the structural constraints necessary for cross-population inference. We demonstrate the expressiveness and consistency of this procedure and further propose a gradient-based optimization scheme for making scalable inferences in practice. Our results are corroborated with experiments.
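To make the idea of bounding a target functional concrete, here is a minimal sketch for the simplest case, covariate shift. It is not the paper's Neural-TR procedure. It assumes a discrete covariate `X`, a mechanism `P(Y | X)` that is identical in source and target, and a target `P*(X)` that is unconstrained except that it puts mass only on values of `X` seen in the source data. Under those assumptions the target risk is a mixture of the per-value conditional error rates, so it lies between their minimum and maximum. The function name and the toy data are hypothetical.

```python
import numpy as np

def risk_bounds_under_covariate_shift(x, y, h):
    """Bound the target 0-1 risk of classifier h when only P(X) may change.

    Assumptions (illustrative, not taken from the paper):
      * X is discrete and P(Y | X) is invariant across domains;
      * the target P*(X) is arbitrary over the values of X observed at source.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    preds = np.array([h(v) for v in x])
    errors = (preds != y).astype(float)

    # Conditional error rate per value of X; identifiable from source data
    # because P(Y | X) is assumed to be shared with the target domain.
    cond_err = {v.item(): errors[x == v].mean() for v in np.unique(x)}

    # The target risk is sum_x P*(x) * cond_err[x], a convex combination,
    # so it is bounded by the smallest and largest conditional error.
    lower = min(cond_err.values())
    upper = max(cond_err.values())
    return lower, upper, cond_err

# Toy source sample: the classifier predicts Y = X.
x_src = [0, 0, 0, 0, 1, 1, 1, 1]
y_src = [0, 0, 0, 1, 1, 1, 0, 0]
lower, upper, cond_err = risk_bounds_under_covariate_shift(x_src, y_src, lambda v: v)
print(lower, upper)  # 0.25 0.5
```

In this toy sample the source risk is 0.375, while the target risk can be anywhere in [0.25, 0.5] depending on how `P*(X)` shifts. Richer selection diagrams generally have no such closed form, which is the setting the paper's optimization-based approach addresses.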

Paper Structure

This paper contains 34 sections, 8 theorems, 73 equations, 16 figures, 1 table, 2 algorithms.

Key Result

Theorem 1

Consider the tuple of SCMs $\mathbb{M}$ that induces the selection diagram $\mathcal{G}^{\mathbf{\Delta}}$ over the variables $\mathbf{V}$, and entails the source distributions $\mathbb{P}$ and the target distribution $P^*$. Let $\psi: \Omega_{\mathbf{V}} \to \mathbb{R}$ be a functional of interest. Consider the f where each $\mathcal{N}^i$ is a canonical model characterized by a joint distribution over $\{R_V\}$ ...

Figures (16)

  • Figure 1: Illustration of the task of evaluating the generalization error of a model $h$. The mechanisms for $C$ and $W$ vary across domains.
  • Figure 2: Selection diagram & Canonical param.
  • Figure 3: Selection diagram for the Neural-TR example.
  • Figure 4: Simulation panels (a)-(c): worst-case risk evaluation results as a function of Neural-TR training iterations. CRO panels (a), (b): worst-case risk evaluation of CRO.
  • Figure 5: $\mathcal{G}^{\mathbf{\Delta}}_{\text{CMNIST}}$
  • ...and 11 more figures

Theorems & Definitions (34)

  • Definition 1
  • Example 1: Covariate shift
  • Definition 2: Domain discrepancy
  • Definition 3: Selection diagram
  • Example 2: Generalization performance of classifiers
  • Definition 4: Partial Transportability
  • Example 3: The bow model
  • Definition 5: Canonical SCM
  • Example 3: continued
  • Theorem 1: Partial-TR with canonical models
  • ...and 24 more