Estimating Joint interventional distributions from marginal interventional data
Sergio Hernan Garrido Mejia, Elke Kirschbaum, Armin Kekić, Atalanti Mastakouri
TL;DR
The paper addresses learning joint interventional distributions from marginal interventional data by extending the Causal Maximum Entropy (CMAXENT) framework to i-CMAXENT. It proves that the resulting model remains in the exponential family and provides identifiability results for causal feature selection and for inferring joint interventional effects from single-variable interventions. Key contributions include a principled method for merging marginal data for joint interventional inference, identifiability conditions for adjustment sets, and empirical evidence on synthetic data that i-CMAXENT outperforms CMAXENT and performs comparably to the KCI test, even though the KCI test requires joint observations of all variables. This work enables data fusion across disjoint experiments and offers a foundational tool for identifying joint causal effects from marginal information. By using interventional data without requiring full joint observations, the approach broadens causal discovery and effect estimation under the causal marginal problem.
Abstract
In this paper, we show how to exploit interventional data to acquire the joint conditional distribution of all the variables using the Maximum Entropy principle. To this end, we extend the Causal Maximum Entropy method to make use of interventional data in addition to observational data. Using Lagrange duality, we prove that the solution to the Causal Maximum Entropy problem with interventional constraints lies in the exponential family, as does the Maximum Entropy solution. Our method allows us to perform two tasks of interest when marginal interventional distributions are provided for any subset of the variables. First, we show how to perform causal feature selection from a mixture of observational and single-variable interventional data and, second, how to infer joint interventional distributions. For the former task, we show on synthetically generated data that our proposed method outperforms the state-of-the-art method for merging datasets and yields results comparable to the KCI-test, which requires access to joint observations of all variables.
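To illustrate the exponential-family claim, the following is a minimal sketch of the plain Maximum Entropy principle, not the paper's i-CMAXENT procedure. It assumes three binary variables, a hypothetical set of target moments that only ever involve pairs (x1, y) and (x2, y), and a helper named `fit_maxent` introduced here for illustration. Minimizing the Lagrange dual yields a distribution of the form p(x) ∝ exp(θ · f(x)), which is the exponential-family shape referred to in the abstract.

```python
import itertools

import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

# All 8 joint states of three binary variables, ordered as (x1, x2, y).
STATES = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)


def features(states):
    """Sufficient statistics: means of x1, x2, y and the pairwise products x1*y, x2*y."""
    x1, x2, y = states[:, 0], states[:, 1], states[:, 2]
    return np.stack([x1, x2, y, x1 * y, x2 * y], axis=1)


F = features(STATES)  # shape (8, 5)


def fit_maxent(target_moments):
    """Fit the MaxEnt distribution matching the given moments (hypothetical helper).

    Minimizes the convex dual  log Z(theta) - theta . mu,  whose minimizer gives
    p(x) = exp(theta . f(x)) / Z(theta).
    """

    def dual(theta):
        return logsumexp(F @ theta) - theta @ target_moments

    def grad(theta):
        logits = F @ theta
        p = np.exp(logits - logsumexp(logits))
        # Gradient of the dual: model moments minus target moments.
        return F.T @ p - target_moments

    res = minimize(dual, np.zeros(F.shape[1]), jac=grad, method="BFGS")
    logits = F @ res.x
    return res.x, np.exp(logits - logsumexp(logits))


# Hypothetical moments, as if estimated from two datasets that never observe
# x1 and x2 together: E[x1], E[x2], E[y], E[x1*y], E[x2*y].
target = np.array([0.50, 0.40, 0.45, 0.30, 0.25])
theta, p = fit_maxent(target)

for state, prob in zip(STATES.astype(int), p):
    print(tuple(state), round(float(prob), 4))
```

Because no constraint couples x1 and x2 directly, the fitted joint makes them conditionally independent given y; this is the least-committal way to merge the two marginal sources under these assumptions. The paper's method additionally imposes a causal ordering and interventional constraints, which this sketch does not attempt to reproduce.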
