A primer on optimal transport for causal inference with observational data
Florian F Gunsilius
TL;DR
This paper surveys the deep connections between optimal transport (OT) and causal inference with observational data, arguing that OT provides a foundational language for identifying and bounding causal effects under endogeneity. It develops the role of monotone rearrangements (and their multivariate Brenier-map generalizations) as a structural core for linking unobservables to outcomes, and shows how classic identification strategies, namely instrumental variables (IV), difference-in-differences (DID), and synthetic controls, can be reframed within an OT framework. Key contributions include clarifying when the full causal mechanism is identifiable (under exogeneity or Brenier map structure), deriving tight distributional bounds via path-space OT, and extending methods to nonlinear and multivariate settings through comonotonicity and barycenters. The review also highlights practical tools, such as control variables and distributionally robust methods, that leverage OT to handle weak instruments, limited support, and distributional heterogeneity. Overall, the work provides a unifying perspective that connects causality, probability, and optimization, with implications for both theory and applied econometrics.
Abstract
The theory of optimal transportation has developed into a powerful and elegant framework for comparing probability distributions, with wide-ranging applications in all areas of science. The fundamental idea of analyzing probabilities by comparing their underlying state space naturally aligns with the core idea of causal inference, where understanding and quantifying counterfactual states is paramount. Despite this intuitive connection, explicit research at the intersection of optimal transport and causal inference is only beginning to develop. Yet many foundational models in causal inference have implicitly relied on optimal transport principles for decades without recognizing the underlying connection. The goal of this review is therefore to offer an introduction to the surprisingly deep existing connections between optimal transport and the identification of causal effects with observational data, where optimal transport is not just a set of potential tools but actually forms the foundation of model assumptions. By pointing out these existing connections, this review is intended to unify the language and notation between different areas of statistics, mathematics, and econometrics, and to explore novel problems and directions for future work in both areas that follow from this realization.
