Beyond Conditional Averages: Estimating The Individual Causal Effect Distribution
Richard Post, Edwin van den Heuvel
TL;DR
The paper addresses the problem that, when individual causal effects (ICEs) vary considerably, average effects can be uninformative for any given individual. It shows that the ICE distribution is identifiable from cross-sectional data when, in addition to consistency, positivity, and conditional exchangeability, the conditional independence assumption $Y^1-Y^0 \perp Y^0 \mid \boldsymbol{X}$ holds, and it develops latent-variable causal mixed models that recover the ICE distribution via latent components $N_Y$ and $U_1$. By reframing ICE estimation as a deconvolution-like problem and using flexible Gaussian mixture random effects, the authors provide a practical methodology to quantify the ICE distribution and its tail properties. In the Framingham case study on the effect of Hepatic Steatosis on LVFP, a clinical precursor to heart failure, the authors estimate under these assumptions that 20.6% (95% Bayesian credible interval: 8.9%, 33.6%) of the population has a harmful effect greater than twice the average causal effect, underscoring the added value of distributional causal analysis for precision medicine and risk stratification.
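One way to see why the independence assumption turns ICE estimation into a deconvolution-like problem is sketched below. This is an illustrative argument, not necessarily the authors' proof. The symbol $A$ for the binary exposure and $\varphi$ for a conditional characteristic function are notation assumed here. Writing $Y^1 = Y^0 + (Y^1 - Y^0)$ and using $Y^1-Y^0 \perp Y^0 \mid \boldsymbol{X}$, the characteristic function of the sum factorizes:

$$
\varphi_{Y^1 \mid \boldsymbol{X}}(t) = \varphi_{Y^0 \mid \boldsymbol{X}}(t)\,\varphi_{Y^1-Y^0 \mid \boldsymbol{X}}(t).
$$

Under consistency, positivity, and conditional exchangeability, the distribution of $Y^a \mid \boldsymbol{X}$ equals that of $Y \mid A=a, \boldsymbol{X}$, so wherever the denominator is nonzero

$$
\varphi_{Y^1-Y^0 \mid \boldsymbol{X}}(t) = \frac{\varphi_{Y \mid A=1, \boldsymbol{X}}(t)}{\varphi_{Y \mid A=0, \boldsymbol{X}}(t)}.
$$

The right-hand side involves only observable conditional distributions. As a special case, the second moments give $\operatorname{Var}(Y^1-Y^0 \mid \boldsymbol{X}) = \operatorname{Var}(Y \mid A=1, \boldsymbol{X}) - \operatorname{Var}(Y \mid A=0, \boldsymbol{X})$.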
Abstract
In recent years, the field of causal inference from observational data has emerged rapidly. The literature has focused on (conditional) average causal effect estimation. When (remaining) variability of individual causal effects (ICEs) is considerable, average effects may be uninformative for an individual. The fundamental problem of causal inference precludes estimating the joint distribution of potential outcomes without making assumptions. In this work, we show that the ICE distribution is identifiable under (conditional) independence of the individual effect and the potential outcome under no exposure, in addition to the common assumptions of consistency, positivity, and conditional exchangeability. Moreover, we present a family of flexible latent variable models that can be used to study individual effect modification and estimate the ICE distribution from cross-sectional data. How such latent variable models can be applied and validated in practice is illustrated in a case study on the effect of Hepatic Steatosis on a clinical precursor to heart failure. Under the assumptions presented, we estimate that 20.6% (95% Bayesian credible interval: 8.9%, 33.6%) of the population has a harmful effect greater than twice the average causal effect.
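The identification idea can be checked on simulated data. The following is a minimal sketch under strong simplifying assumptions that are not taken from the paper: exposure is randomized (so no covariates are needed), $Y^0$ and the ICE are independent Gaussians, and all numeric values are hypothetical. It recovers the mean and variance of the ICE from the two observed arms and then computes the proportion with an effect larger than twice the average effect. The authors' actual approach uses latent variable models with Gaussian mixture random effects, which this moment-based sketch does not implement.

```python
import math
import numpy as np

def simulate(n=200_000, seed=0):
    """Simulate a randomized study in which the ICE is independent of Y^0."""
    rng = np.random.default_rng(seed)
    y0 = rng.normal(10.0, 2.0, n)   # potential outcome under no exposure (hypothetical values)
    ice = rng.normal(1.0, 1.5, n)   # individual causal effect, drawn independently of y0
    a = rng.integers(0, 2, n)       # randomized binary exposure
    y = y0 + a * ice                # consistency: observed outcome is Y^a for a = A
    return y, a, ice

def ice_moments(y, a):
    """Estimate the ICE mean and variance from observed arms only."""
    mean = y[a == 1].mean() - y[a == 0].mean()
    # Valid because Var(Y^1) = Var(Y^0) + Var(ICE) when ICE is independent of Y^0.
    var = y[a == 1].var(ddof=1) - y[a == 0].var(ddof=1)
    return mean, var

def gaussian_tail_above(threshold, mean, var):
    """P(ICE > threshold), assuming (for this sketch only) a Gaussian ICE."""
    if var <= 0:
        raise ValueError("estimated ICE variance must be positive")
    z = (threshold - mean) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

y, a, ice = simulate()
ace_hat, var_hat = ice_moments(y, a)
tail_hat = gaussian_tail_above(2.0 * ace_hat, ace_hat, var_hat)
# The true ICEs are only available because the data are simulated.
tail_true = float((ice > 2.0 * ice.mean()).mean())
print(f"ACE estimate: {ace_hat:.3f}, ICE variance estimate: {var_hat:.3f}")
print(f"P(ICE > 2*ACE): estimated {tail_hat:.3f}, simulated truth {tail_true:.3f}")
```

If the independence assumption fails, for example when individuals with higher $Y^0$ also have larger effects, the variance difference no longer equals the ICE variance and the tail estimate can be badly off, which is why the assumption needs to be argued and validated in an application.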
