$χ$SPN: Characteristic Interventional Sum-Product Networks for Causal Inference in Hybrid Domains
Harsh Poonia, Moritz Willig, Zhongjie Yu, Matej Zečević, Kristian Kersting, Devendra Singh Dhami
TL;DR
This work addresses causal inference in hybrid domains, where discrete and continuous variables are mixed, by proposing χSPN, a Characteristic Interventional Sum-Product Network. χSPN places univariate characteristic functions at the leaves and uses a neural network, conditioned on the intervention, to learn the characteristic function of the root distribution. This keeps inference of interventional distributions tractable even when closed-form densities are unavailable. Training relies on the Empirical Characteristic Function (ECF) of the intervened data and uses the characteristic function distance (CFD) as the objective for matching interventional distributions; inversion techniques then recover joint densities. The paper shows that χSPN, trained on single-intervention data, can generalize to multiple interventions, and reports promising results on three synthetic heterogeneous datasets, highlighting its potential for causal reasoning in realistic mixed-data settings.
Abstract
Causal inference in hybrid domains, characterized by a mixture of discrete and continuous variables, presents a formidable challenge. We take a step in this direction and propose the Characteristic Interventional Sum-Product Network ($χ$SPN), which is capable of estimating interventional distributions in the presence of random variables drawn from mixed distributions. $χ$SPN uses characteristic functions in the leaves of an interventional SPN (iSPN), thereby providing a unified view of discrete and continuous random variables through the Fourier-Stieltjes transform of the probability measures. A neural network is used to estimate the parameters of the learned iSPN from the intervened data. Our experiments on three synthetic heterogeneous datasets suggest that $χ$SPN can effectively capture the interventional distributions for both discrete and continuous variables while being expressive and causally adequate. We also show that $χ$SPN generalizes to multiple interventions while being trained only on single-intervention data.
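To make the "unified view" concrete, the following is a minimal sketch, not the authors' implementation, of how an empirical characteristic function treats a discrete and a continuous variable with the same estimator, and how a Monte Carlo characteristic function distance (CFD) can compare two samples. The function names `ecf` and `cfd_squared`, the standard normal weighting over frequencies, and the sample sizes are assumptions chosen for illustration.

```python
import numpy as np


def ecf(samples, t):
    """Empirical characteristic function: mean of exp(i * t * x) over the samples."""
    samples = np.asarray(samples, dtype=float)
    t = np.asarray(t, dtype=float)
    # np.outer builds a (len(t), len(samples)) grid of phases t_k * x_j.
    return np.exp(1j * np.outer(t, samples)).mean(axis=1)


def cfd_squared(x, y, t):
    """Monte Carlo estimate of a squared CFD: average of |ecf_x(t) - ecf_y(t)|^2
    over frequencies t drawn from a weighting distribution (an assumption here)."""
    return float(np.mean(np.abs(ecf(x, t) - ecf(y, t)) ** 2))


rng = np.random.default_rng(0)
t = rng.normal(size=128)  # frequencies from an assumed N(0, 1) weighting

# One discrete and one continuous variable, handled by the same estimator.
bern = rng.binomial(1, 0.3, size=5000)  # Bernoulli(0.3)
gauss = rng.normal(size=5000)           # N(0, 1)

# Closed-form characteristic functions for comparison.
cf_bern = 0.7 + 0.3 * np.exp(1j * t)
cf_gauss = np.exp(-0.5 * t**2)

err_bern = np.max(np.abs(ecf(bern, t) - cf_bern))
err_gauss = np.max(np.abs(ecf(gauss, t) - cf_gauss))

# CFD is near zero for samples from the same distribution and larger
# when one sample comes from a shifted distribution.
same = cfd_squared(gauss, rng.normal(size=5000), t)
shifted = cfd_squared(gauss, rng.normal(loc=1.0, size=5000), t)

print(f"max ECF error (Bernoulli): {err_bern:.4f}")
print(f"max ECF error (Gaussian):  {err_gauss:.4f}")
print(f"CFD^2 same: {same:.5f}, shifted: {shifted:.5f}")
```

Because the characteristic function always exists, no density or mass function is needed for either variable type, which is the property the abstract relies on for hybrid domains.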
