Do Finetti: On Causal Effects for Exchangeable Data
Siyuan Guo, Chi Zhang, Karthika Mohan, Ferenc Huszár, Bernhard Schölkopf
TL;DR
This work extends causal effect estimation beyond i.i.d. data to exchangeable data generated by independent causal mechanisms (ICM). It introduces a generalized truncated factorization for ICM generative processes, formalizes interventions via delta distributions, and proves that causal effects are identifiable from exchangeable data. A causal Pólya urn model illustrates how a post-interventional distribution can depend on the other observations being conditioned on, and the Do-Finetti algorithm performs causal graph discovery and effect estimation simultaneously from multi-environment data. Together, these results support principled causal analysis in the non-i.i.d., multi-environment settings that arise in health, biology, and machine learning systems.
Abstract
We study causal effect estimation in a setting where the data are not i.i.d. (independent and identically distributed). We focus on exchangeable data satisfying an assumption of independent causal mechanisms. Traditional causal effect estimation frameworks, e.g., relying on structural causal models and do-calculus, are typically limited to i.i.d. data and do not extend to more general exchangeable generative processes, which naturally arise in multi-environment data. To address this gap, we develop a generalized framework for exchangeable data and introduce a truncated factorization formula that facilitates both the identification and estimation of causal effects in our setting. To illustrate potential applications, we introduce a causal Pólya urn model and demonstrate how intervention propagates effects in exchangeable data settings. Finally, we develop an algorithm that performs simultaneous causal discovery and effect estimation given multi-environment data.
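To make the distinction between exchangeable and i.i.d. data concrete, the following minimal sketch uses the classical (non-causal) Pólya urn, not the causal Pólya urn model introduced in the paper. The function name `sequence_probability` and the starting urn of one red and one blue ball are illustrative assumptions. The sketch computes exact sequence probabilities and checks two things: every reordering of a draw sequence is equally likely (exchangeability), yet observing one draw changes the distribution of another (no independence).

```python
from fractions import Fraction
from itertools import permutations


def sequence_probability(draws, red=1, blue=1):
    """Exact probability of a draw sequence (1 = red, 0 = blue) from a
    classical Polya urn: each drawn ball is returned together with one
    extra ball of the same color. Starting counts are illustrative."""
    prob = Fraction(1)
    for x in draws:
        total = red + blue
        if x == 1:
            prob *= Fraction(red, total)
            red += 1   # reinforce red
        else:
            prob *= Fraction(blue, total)
            blue += 1  # reinforce blue
    return prob


# Exchangeability: all orderings of two reds and one blue share one probability.
perm_probs = {p: sequence_probability(p) for p in set(permutations((1, 1, 0)))}

# Dependence: compare the marginal P(X2 = red) with P(X2 = red | X1 = red).
p_first_red = sequence_probability((1,))
p_both_red = sequence_probability((1, 1))
p_second_red = sequence_probability((1, 1)) + sequence_probability((0, 1))
p_second_red_given_first = p_both_red / p_first_red

print(perm_probs)
print(p_second_red, p_second_red_given_first)
```

In this toy urn the draws are identically distributed and their joint distribution is permutation-invariant, but they are not independent, which is the kind of data the i.i.d. assumption rules out and the exchangeable setting admits.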
