An AI-powered Bayesian generative modeling approach for causal inference in observational studies
Qiao Liu, Wing Hung Wong
TL;DR
CausalBGM tackles causal inference in observational studies with high-dimensional covariates by learning a low-dimensional latent confounder space $Z=(Z_0,Z_1,Z_2,Z_3)$ within a fully Bayesian generative framework. It removes the encoder-decoder loop of prior AI-based methods, instead performing iterative mini-batch updates and modeling both mean and variance with Bayesian neural networks, which yields well-calibrated posterior intervals for the average dose-response function (ADRF) and the individual treatment effect (ITE). The approach achieves superior or competitive accuracy across continuous and binary treatments and scales to large datasets, supported by EGM-based initialization and a decoupled, parallelizable inference scheme. These contributions offer a principled, scalable, and interpretable tool for causal inference in genomics, healthcare, and the social sciences, with public code and tutorials available.
Abstract
Causal inference in observational studies with high-dimensional covariates presents significant challenges. We introduce CausalBGM, an AI-powered Bayesian generative modeling approach that captures the causal relationship among covariates, treatment, and outcome. The core innovation is to estimate the individual treatment effect (ITE) by learning the individual-specific distribution of a low-dimensional latent feature set (e.g., latent confounders) that drives changes in both treatment and outcome. This individualized posterior representation yields estimates of the ITE together with well-calibrated posterior intervals while mitigating confounding effects. CausalBGM is fitted through an iterative algorithm that updates the model parameters and the latent features until convergence. This framework leverages the power of AI to capture complex dependencies among variables while adhering to Bayesian principles. Extensive experiments demonstrate that CausalBGM consistently outperforms state-of-the-art methods, particularly in scenarios with high-dimensional covariates and large-scale datasets. By addressing key limitations of existing methods, CausalBGM emerges as a robust and promising framework for advancing causal inference in a wide range of modern applications. The code for CausalBGM is available at https://github.com/liuq-lab/bayesgm. The tutorial for CausalBGM is available at https://causalbgm.readthedocs.io.
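To make the iterative fitting scheme concrete, the following is a minimal toy sketch of alternating updates between model parameters and per-individual latent features. It is not the CausalBGM implementation and does not use the bayesgm package. As stated assumptions, it swaps the Bayesian neural networks for a linear-Gaussian model with unit noise variances, uses a single scalar latent feature instead of $Z=(Z_0,Z_1,Z_2,Z_3)$, replaces mini-batch updates with closed-form full-data MAP updates, and initializes the latent feature from the first principal component of the covariates rather than from EGM. All function names (`simulate`, `update_params`, `update_latent`, `objective`, `fit`) and the ridge strength `lam` are hypothetical.

```python
import numpy as np

def simulate(n=500, p=20, seed=0):
    """Toy data: one latent confounder z drives covariates v, treatment x, outcome y."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    W = rng.normal(size=p)
    v = np.outer(z, W) + rng.normal(size=(n, p))   # covariates
    x = 1.0 * z + rng.normal(size=n)               # treatment
    y = 2.0 * x + 1.5 * z + rng.normal(size=n)     # outcome (simulated effect of x is 2.0)
    return v, x, y

def objective(v, x, y, z, W, a, b, c, lam):
    """Negative log joint density (up to a constant) with unit noise variances,
    a standard normal prior on z, and a Gaussian (ridge) prior on the parameters."""
    fit = (np.sum((v - np.outer(z, W)) ** 2)
           + np.sum((x - a * z) ** 2)
           + np.sum((y - b * x - c * z) ** 2))
    prior = z @ z + lam * (W @ W + a ** 2 + b ** 2 + c ** 2)
    return 0.5 * (fit + prior)

def update_params(v, x, y, z, lam):
    """Given latent features, update parameters by ridge regression (exact block minimizer)."""
    zz = z @ z
    W = (z @ v) / (zz + lam)                       # covariate loadings, shape (p,)
    a = (z @ x) / (zz + lam)                       # treatment model: x ~ a * z
    D = np.column_stack([x, z])                    # outcome model: y ~ b * x + c * z
    b, c = np.linalg.solve(D.T @ D + lam * np.eye(2), D.T @ y)
    return W, a, b, c

def update_latent(v, x, y, W, a, b, c):
    """Given parameters, set each individual's latent feature to its posterior mode."""
    num = v @ W + a * x + c * (y - b * x)
    den = W @ W + a ** 2 + c ** 2 + 1.0            # the +1 comes from the N(0, 1) prior
    return num / den

def fit(v, x, y, lam=1.0, n_iter=30):
    # Assumed initialization: first principal component of the centered covariates.
    vc = v - v.mean(axis=0)
    U, _, _ = np.linalg.svd(vc, full_matrices=False)
    z = U[:, 0] * np.sqrt(len(x))
    history = []
    for _ in range(n_iter):
        W, a, b, c = update_params(v, x, y, z, lam)
        z = update_latent(v, x, y, W, a, b, c)
        history.append(objective(v, x, y, z, W, a, b, c, lam))
    return z, (W, a, b, c), history

v, x, y = simulate()
z_hat, (W_hat, a_hat, b_hat, c_hat), history = fit(v, x, y)
```

Because each step exactly minimizes the objective over its own block of variables, `history` is non-increasing across iterations. The sketch returns point estimates only. It produces no posterior intervals, which in the approach described above come from the fully Bayesian treatment of both mean and variance.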
