Designing Ambiguity Sets for Distributionally Robust Optimization Using Structural Causal Optimal Transport

Ahmad-Reza Ehyaei, Golnoosh Farnadi, Samira Samadi

TL;DR

This work addresses distributional robustness when data follow a structural causal model (SCM) by introducing a structural causal optimal transport (OT) framework. It defines a structural causal ambiguity set $\mathcal{B}^{\mathcal{F}}(\mathbb{P},\delta)$ that leverages the structural equations and a bijective reduced-form mapping $g$ to connect the endogenous and exogenous spaces, yielding a more realistic distributional ambiguity set (DAS) than the existing adapted ($\mathcal{A}$) and $\mathcal{G}$-causal variants. A relaxed, entropy-regularized version $W^{\mathcal{F}_{\varepsilon}}$ admits a difference-of-convex formulation and a Sinkhorn-based algorithm, with finite-sample guarantees when the SCM is estimated and a dimension-free shrinkage rate for the radius that arises from the independence of the exogenous variables. Using a simple two-variable additive noise model (ANM) example, the paper shows that incorporating structural equations reduces the worst-case loss and produces more coherent ambiguity sets, offering a scalable approach to designing a DAS under causal structure in distributionally robust optimization (DRO) problems.
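To make the reduced-form mapping $g$ concrete, here is a minimal sketch for a two-variable additive noise model of the form $\mathbf{E} = \mathbf{A} + \mathbf{U}_\mathbf{E}$. Treating $\mathbf{A}$ as a root node with $\mathbf{A} = \mathbf{U}_\mathbf{A}$, the Gaussian noise, and the function names `g` and `g_inv` are assumptions made for illustration; they are not taken from the paper's code.

```python
import numpy as np

def g(u):
    """Reduced-form map: exogenous (U_A, U_E) -> endogenous (A, E)."""
    u = np.asarray(u, dtype=float)
    a = u[..., 0]          # assumed root node: A = U_A
    e = a + u[..., 1]      # structural equation: E = A + U_E
    return np.stack([a, e], axis=-1)

def g_inv(x):
    """Inverse map: endogenous (A, E) -> exogenous (U_A, U_E)."""
    x = np.asarray(x, dtype=float)
    u_a = x[..., 0]
    u_e = x[..., 1] - x[..., 0]   # recover the noise by subtracting the parent
    return np.stack([u_a, u_e], axis=-1)

rng = np.random.default_rng(0)
u = rng.normal(size=(1000, 2))    # independent exogenous noise (assumed Gaussian)
x = g(u)                          # endogenous samples
u_back = g_inv(x)                 # round trip back to the exogenous space
```

Because the noise enters additively, `g` is invertible in closed form, so a distribution on the endogenous space can be pulled back to the exogenous space and vice versa.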

Abstract

Distributionally robust optimization tackles out-of-sample issues like overfitting and distribution shifts by adopting an adversarial approach over a range of possible data distributions, known as the ambiguity set. To balance conservatism and accuracy, these sets must include realistic probability distributions by leveraging information from the nominal distribution. Assuming that nominal distributions arise from a structural causal model with a directed acyclic graph $\mathcal{G}$ and structural equations, previous methods such as adapted and $\mathcal{G}$-causal optimal transport have only utilized causal graph information in designing ambiguity sets. In this work, we propose incorporating structural equations, which include causal graph information, to enhance ambiguity sets, resulting in more realistic distributions. We introduce structural causal optimal transport and its associated ambiguity set, demonstrating their advantages and connections to previous methods. A key benefit of our approach is a relaxed version, where a regularization term replaces the complex causal constraints, enabling an efficient algorithm via difference-of-convex programming to solve structural causal optimal transport. We also show that when structural information is absent and must be estimated, our approach remains effective and provides finite sample guarantees. Lastly, we address the radius of ambiguity sets, illustrating how our method overcomes the curse of dimensionality in optimal transport problems, achieving faster shrinkage with dimension-free order.
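As background for the regularized relaxation, the following sketch runs standard Sinkhorn iterations for plain entropy-regularized OT between two empirical measures. It is not the paper's structural causal algorithm: it has no causal regularizer or difference-of-convex outer loop, and the sample data, the cost normalization, and the values of `eps` and `n_iters` are illustrative assumptions.

```python
import numpy as np

def sinkhorn(a, b, cost, eps=0.5, n_iters=200):
    """Entropic OT plan between histograms a and b via Sinkhorn scaling."""
    K = np.exp(-cost / eps)            # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)                # rescale to match row marginals
        v = b / (K.T @ u)              # rescale to match column marginals
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))             # source samples (hypothetical data)
y = rng.normal(loc=0.5, size=(60, 2))    # target samples (hypothetical data)
a = np.full(50, 1.0 / 50)                # uniform weights on the source
b = np.full(60, 1.0 / 60)                # uniform weights on the target

cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
cost = cost / cost.max()                 # normalize so eps has a stable scale
plan = sinkhorn(a, b, cost)
reg_cost = float((plan * cost).sum())    # transport cost under the entropic plan
```

Smaller values of `eps` bring the plan closer to the unregularized OT solution but slow convergence and can underflow the kernel, in which case a log-domain implementation is the usual remedy.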

Paper Structure

This paper contains 45 sections, 26 theorems, 168 equations, 3 figures, 2 tables, 3 algorithms.

Key Result

Proposition 1

Let $\mathbb{P} \in \mathcal{P}^{\mathsmaller \mathcal{F}}(\mathcal{X})$, then:

Figures (3)

  • Figure 1: The underlying distribution $\mathbb{P}^\ast$ originates from a causal structure. The dark blue region represents $\mathcal{B}(\mathbb{P},\delta)$ the ambiguity set for the classical OT. The light blue region corresponds to $\mathcal{B}^{\mathsmaller \mathcal{A}}(\mathbb{P},\delta)$ the DAS for adapted optimal transport. The light yellow region denotes $\mathcal{B}^{\mathsmaller \mathcal{G}}(\mathbb{P},\delta)$ the DAS for $\mathcal{G}$-causal OT, and the dark yellow region represents $\mathcal{B}^{\mathsmaller \mathcal{F}}(\mathbb{P},\delta)$ the DAS for our structural causal OT with diameter $\delta$.
  • Figure 2: (a) Empirical estimate of the true probability distribution for the model $\mathbf{E} = \mathbf{A} + \mathbf{U}_\mathbf{E}$ (Age and Education are normalized). (b) Ambiguity set obtained via classical OT with radius 0.5. (c) Structural causal ambiguity set with radius 0.5. (d) Comparison of worst-case losses for the structural causal and $\mathcal{G}$-causal DAS with radius $\delta = 0.5$ and function $\psi(x,y) = (x - y)^2$.
  • Figure 3: The worst-case loss values are shown for different levels of $\alpha$ across various functions. The red line represents the structural causal ambiguity set loss $\sup_{\mathbb{Q} \in \mathcal{B}^{\mathsmaller \mathcal{F}}(\mathbb{P},\delta)}\mathbb{E}[\psi]$, while the blue line represents the $\mathcal{G}$-causal loss $\sup_{\mathbb{Q} \in \mathcal{B}^{\mathsmaller \mathcal{G}}(\mathbb{P},\delta)}\mathbb{E}[\psi]$.

Theorems & Definitions (36)

  • Definition 1: $\mathcal{F}$-Compatible Measures
  • Proposition 1
  • Definition 2: $\mathcal{F}$-Compatible Plans
  • Proposition 2
  • Proposition 3
  • Definition 3: Structural Causal OT
  • Proposition 4
  • Proposition 5
  • Proposition 6
  • Definition 4: Relaxed Structural Causal OT
  • ...and 26 more