Counterfactual Identifiability via Dynamic Optimal Transport

Fabio De Sousa Ribeiro, Ainkaran Santhirasekaram, Ben Glocker

TL;DR

This work addresses counterfactual identification for high-dimensional multivariate outcomes from observational data by combining continuous-time flows with dynamic optimal transport (OT). It shows that, under Markovian SCMs and standard regularity conditions, the dynamic OT flow yields a unique, monotone counterfactual transport map that preserves rank in multivariate settings, drawing on the Monge–Kantorovich framework via the Brenier gradient map and the Benamou–Brenier formulation. The authors extend the identifiability results to non-Markovian settings (instrumental variable, backdoor, and frontdoor) and show that a Markovian OT coupling can be learned in practice through flow matching, enabling counterfactual inference without paired data. Empirically, they validate the theory on a counterfactual ellipse dataset and a MIMIC Chest X-ray study, reporting improved counterfactual soundness (composition, effectiveness, reversibility) over prior methods and discussing practical considerations for scaling OT. Overall, the paper provides a theoretical foundation for high-dimensional counterfactual identifiability and a practical counterfactual inference pipeline, with implications for causal reasoning in imaging, while noting the current scalability limitations of large-scale OT.

Abstract

We address the open question of counterfactual identification for high-dimensional multivariate outcomes from observational data. Pearl (2000) argues that counterfactuals must be identifiable (i.e., recoverable from the observed data distribution) to justify causal claims. A recent line of work on counterfactual inference shows promising results but lacks identification, undermining the causal validity of its estimates. To address this, we establish a foundation for multivariate counterfactual identification using continuous-time flows, including non-Markovian settings under standard criteria. We characterise the conditions under which flow matching yields a unique, monotone and rank-preserving counterfactual transport map with tools from dynamic optimal transport, ensuring consistent inference. Building on this, we validate the theory in controlled scenarios with counterfactual ground-truth and demonstrate improvements in axiomatic counterfactual soundness on real images.

Paper Structure

This paper contains 49 sections, 15 theorems, 92 equations, 15 figures, 12 tables.

Key Result

Proposition 4.4

If a mechanism $f(\mathbf{pa}, u)$ is monotone in $u$ (Def. 4.3), then the respective counterfactual transport map $T^*(\mathbf{pa}^*, \mathbf{pa}, x)$ is monotone in $x$.
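
As a one-dimensional illustration of this proposition, the sketch below assumes a hypothetical location-scale mechanism $f(\mathrm{pa}, u) = \mu(\mathrm{pa}) + \sigma(\mathrm{pa})\,u$ with $\sigma > 0$, which is increasing in $u$. The functions `mu` and `sigma`, the parent values, and the pairwise monotonicity check are assumptions chosen for illustration; they are not taken from the paper, and the scalar check $(g(a) - g(b))(a - b) \ge 0$ is only the 1D reading of a monotone operator.

```python
import math

def mu(pa):
    # Hypothetical location term (assumption, not from the paper).
    return 2.0 * pa

def sigma(pa):
    # Hypothetical scale term; strictly positive, so f is increasing in u.
    return math.exp(0.5 * pa)

def f(pa, u):
    # Mechanism: maps parents and exogenous noise to the outcome.
    return mu(pa) + sigma(pa) * u

def f_inv(pa, x):
    # Inverse of f in u for fixed parents (recovers the noise).
    return (x - mu(pa)) / sigma(pa)

def counterfactual_map(pa_star, pa, x):
    # T*(pa*, pa, x) = f(pa*, f^{-1}(pa, x)): recover u, then push it
    # through the mechanism under the counterfactual parents.
    return f(pa_star, f_inv(pa, x))

def is_monotone(g, points):
    # 1D monotonicity: (g(a) - g(b)) * (a - b) >= 0 for every pair.
    return all((g(a) - g(b)) * (a - b) >= 0 for a in points for b in points)

xs = [-3.0 + 0.5 * i for i in range(13)]
monotone_in_x = is_monotone(lambda x: counterfactual_map(1.0, -1.0, x), xs)

# Reversing the intervention recovers the original observation.
x_obs = 0.7
x_cf = counterfactual_map(1.0, -1.0, x_obs)
x_rev = counterfactual_map(-1.0, 1.0, x_cf)
```

In this toy setting the map is affine in $x$ with positive slope $\sigma(\mathrm{pa}^*)/\sigma(\mathrm{pa})$, so monotonicity holds by construction; the multivariate case treated in the paper requires the operator-level definition.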

Figures (15)

  • Figure 1: Four canonical causal graphs. From left to right: (i) Markovian, (ii) Instrumental Variable, (iii) Backdoor, and (iv) Frontdoor. Dashed bidirected arcs denote unobserved confounding.
  • Figure 2: Counterfactual ellipse generation. (Top) Comparing our OT coupling flow to the naive approach (Section \ref{sec:cf_maps_prescription}). (Bottom) OT maps exhibit greater counterfactual reversibility. A counterfactual $x^* = T_{\operatorname{pa}^*} \circ T^{-1}_{\operatorname{pa}} (x)$ is reversed by $x_r = T_{\operatorname{pa}} \circ T^{-1}_{\operatorname{pa}^*} (x^*)$, and a perfect reversal squares the circle.
  • Figure 3: (a) CF error; (b) curl of the vector field during training.
  • Figure 4: Qualitative counterfactual inference results on MIMIC Chest X-ray. We observe faithful, reversible interventions without requiring counterfactual fine-tuning, or classifier(-free) guidance.
  • Figure 5: Visualising the curl ($\nabla \times v_t$) of the learned vector field of different models over time (scaled to $[-1, 1]$), for a given intervention on the parents $\operatorname{PA}=-1$. We can see that the EBM variants are curl-free (irrotational) by design, as the vector field is given by the gradient of a scalar potential. We also see that the OT vector fields are smoother and $\textsc{OT-Flow}$ exhibits milder 'rotations' compared to $\textsc{Flow}$. This is consistent with convergence to the Brenier map, which is itself curl-free.
  • ...and 10 more figures
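
The Figure 5 caption relies on the fact that a vector field given by the gradient of a scalar potential has zero curl. A minimal numerical check of that fact in 2D is sketched below; the potential, the rotational comparison field, and the finite-difference step are hypothetical choices for illustration, and the fields here are static rather than the time-dependent $v_t$ learned in the paper.

```python
import math

def grad_field(x, y):
    # Gradient of the hypothetical potential
    # phi(x, y) = sin(x) * cos(y) + 0.5 * (x^2 + y^2), computed analytically.
    vx = math.cos(x) * math.cos(y) + x
    vy = -math.sin(x) * math.sin(y) + y
    return vx, vy

def rotational_field(x, y):
    # Rigid rotation about the origin; its scalar curl is 2 everywhere.
    return -y, x

def curl_2d(field, x, y, h=1e-4):
    # Scalar 2D curl, d(v_y)/dx - d(v_x)/dy, via central differences.
    dvy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dvx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dvy_dx - dvx_dy

curl_of_gradient = curl_2d(grad_field, 0.3, -1.2)
curl_of_rotation = curl_2d(rotational_field, 0.3, -1.2)
```

The gradient field gives a curl that vanishes up to finite-difference error, while the rotation field does not, which is the distinction the caption draws between the potential-based (EBM) variants and unconstrained flows.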

Theorems & Definitions (41)

  • Definition 3.1: Structural Causal Model (SCM) (Pearl, 2009)
  • Definition 4.1: Markovian SCM
  • Definition 4.2: Counterfactual Identifiability (Pearl, 2009)
  • Definition 4.3: Monotone Operator
  • Proposition 4.4: Monotone Counterfactual Transport Map
  • Remark 4.5
  • Lemma 4.6: Unique and Monotone Dynamic OT Mechanism
  • Remark 4.7
  • Definition 4.8: Transport $\mathcal{L}_3$-Equivalence
  • ...and 31 more