Generalization Bounds for Causal Regression: Insights, Guarantees and Sensitivity Analysis
Daniel Csillag, Claudio José Struchiner, Guilherme Tegoni Goedert
TL;DR
This paper studies finite-sample generalization in causal regression. It derives a change-of-measure bound based on the Pearson $\chi^2$ divergence that ties the unobservable complete causal loss to an observable, reweighted loss plus a gap term $\Delta_{T=a}$ and a variance term. The results cover outcome regression and causal meta-learners (T-, S-, and X-learners) and extend to losses beyond MSE, such as MAE and quantile loss, which enables estimation of robust and quantile treatment effects. The guarantees do not rely on strict ignorability or positivity: they remain valid under hidden confounding and positivity violations. A practical, empirical upper bound on $\Delta_{T=a}$ is obtained from the Brier score of a propensity model, which supports data-driven model selection and sensitivity analysis on semi-synthetic and real datasets (e.g., a Parkinson's disease dataset). In experiments the bounds are tight, often tighter than prior bounds by orders of magnitude, and they can guide algorithm design and model selection in causal inference.
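For orientation, the textbook form of a $\chi^2$ change-of-measure inequality is sketched below. It is the generic Cauchy-Schwarz version and is not claimed to be the paper's exact statement: the symbols $P$, $Q$ and $f$ are assumptions introduced here for illustration, and the paper's bound involving $\Delta_{T=a}$ may take a different form. Let $Q$ be the distribution one can sample from (e.g., units that received treatment $a$), $P$ the target distribution (the whole population) with $P \ll Q$, and $f$ a loss with finite variance under $Q$.

$$
\mathbb{E}_P[f] - \mathbb{E}_Q[f]
= \mathbb{E}_Q\!\left[\left(\frac{dP}{dQ} - 1\right)\bigl(f - \mathbb{E}_Q[f]\bigr)\right]
\le \sqrt{\chi^2(P \,\|\, Q)\;\mathrm{Var}_Q[f]},
\qquad
\chi^2(P \,\|\, Q) = \mathbb{E}_Q\!\left[\left(\frac{dP}{dQ} - 1\right)^{2}\right].
$$

The equality uses $\mathbb{E}_Q[dP/dQ] = 1$, and the inequality is Cauchy-Schwarz. The right-hand side has the same shape as the summary above: a divergence between the two populations multiplied by a variance term.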
Abstract
Many algorithms have recently been proposed for causal machine learning, yet there is little to no theory on their quality, especially in the finite-sample regime. In this work, we propose a theory based on generalization bounds that provides such guarantees. By introducing a novel change-of-measure inequality, we tightly bound the model loss in terms of the deviation of the treatment propensities over the population, a quantity that we show can be bounded empirically. Our theory is fully rigorous and holds even in the face of hidden confounding and violations of positivity. We demonstrate our bounds on semi-synthetic and real data, showcasing their remarkable tightness and practical utility.
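The quantities named above can be made concrete with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' implementation: the data-generating process, the model `h`, and all variable names are hypothetical, the true propensities are known only because the data are simulated, and there is no hidden confounding. It compares the complete loss (unobservable in practice) with the naive loss on treated units and an inverse-propensity reweighted loss, checks the generic Cauchy-Schwarz $\chi^2$ change-of-measure inequality (which may differ from the paper's bound), and computes the Brier score of a propensity model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical synthetic data: one covariate, binary treatment, no hidden confounding.
x = rng.normal(size=n)
propensity = 1.0 / (1.0 + np.exp(-x))          # true P(T=1 | X=x), known only in simulation
t = rng.binomial(1, propensity)
y1 = 2.0 * x + rng.normal(scale=0.5, size=n)   # potential outcome under T=1

# A deliberately imperfect model of y1 whose error is smallest near x = 1.
h = 2.0 * x - 0.3 * (x - 1.0)
loss = (h - y1) ** 2

treated = t == 1
complete_loss = loss.mean()          # needs y1 for every unit: unobservable in practice
naive_loss = loss[treated].mean()    # observable, but reflects only the treated units

# Inverse-propensity reweighting of the treated losses (Horvitz-Thompson form).
reweighted_loss = np.sum(loss[treated] / propensity[treated]) / n

# Generic chi^2 change-of-measure check with P = population, Q = treated units.
p_treated = treated.mean()
density_ratio = p_treated / propensity[treated]   # dP/dQ evaluated on treated units
chi2 = np.mean((density_ratio - 1.0) ** 2)        # estimate of chi^2(P || Q)
gap = complete_loss - naive_loss
chi2_bound = np.sqrt(chi2 * loss[treated].var())

def brier_score(p_hat, t_obs):
    """Mean squared difference between predicted propensities and observed treatments."""
    return float(np.mean((np.asarray(p_hat) - np.asarray(t_obs)) ** 2))

brier_true = brier_score(propensity, t)            # well-specified propensity model
brier_constant = brier_score(np.full(n, 0.5), t)   # uninformative constant model

print(f"complete={complete_loss:.3f}  naive={naive_loss:.3f}  reweighted={reweighted_loss:.3f}")
print(f"gap={gap:.3f}  chi2 bound={chi2_bound:.3f}")
print(f"Brier (true)={brier_true:.3f}  Brier (constant)={brier_constant:.3f}")
```

In this setup the naive loss understates the complete loss because treated units cluster where the model happens to fit well, while the reweighted loss recovers it up to sampling noise. With real data the propensities would have to be estimated, which is where a Brier score becomes useful as an observable measure of propensity-model quality.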
