C-XGBoost: A tree boosting model for causal effect estimation
Niki Kiriakidou, Ioannis E. Livieris, Christos Diou
TL;DR
This work addresses the estimation of causal effects from observational data by introducing C-XGBoost, a multi-output XGBoost-based model that predicts the potential outcomes under both treatment and control using a dedicated loss function. The method combines the predictive strength of tree ensembles on tabular data with the representation-learning aspect of causal neural networks, and it inherits XGBoost's handling of missing values and built-in regularization. Empirical evaluation on semi-synthetic datasets (Synthetic and ACIC) shows that C-XGBoost achieves state-of-the-art or competitive performance in both ATE error and PEHE (Precision in Estimation of Heterogeneous Effect), supported by Dolan and Moré performance profiles and by post-hoc and non-parametric statistical tests. The work offers a practical advance for causal inference on tabular data, with implications for safety-critical applications where counterfactual reasoning is essential.
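The two evaluation metrics named above can be illustrated with a minimal sketch. It assumes the usual definitions for semi-synthetic benchmarks, where the true potential outcomes are known: the ATE error is the absolute difference between the true and estimated average effects, and PEHE is the mean squared difference between true and estimated individual effects (some works report its square root instead). The function and argument names are hypothetical and are not taken from the paper.

```python
import numpy as np

def ate_error(mu0, mu1, y0_hat, y1_hat):
    """Absolute error of the estimated Average Treatment Effect.

    mu0, mu1       : true potential outcomes (control, treated), shape (n,)
    y0_hat, y1_hat : predicted potential outcomes, shape (n,)
    """
    true_ate = np.mean(mu1 - mu0)
    est_ate = np.mean(y1_hat - y0_hat)
    return float(abs(true_ate - est_ate))

def pehe(mu0, mu1, y0_hat, y1_hat):
    """Mean squared error of the individual-level effect estimates
    (assumed definition; the square root is also commonly reported)."""
    true_effect = mu1 - mu0
    est_effect = y1_hat - y0_hat
    return float(np.mean((true_effect - est_effect) ** 2))

# Toy data: the average effect is recovered exactly,
# but the individual effects are not.
mu0 = np.array([0.0, 0.0])
mu1 = np.array([1.0, 3.0])
y0_hat = np.array([0.0, 0.0])
y1_hat = np.array([2.0, 2.0])

print(ate_error(mu0, mu1, y0_hat, y1_hat))  # 0.0
print(pehe(mu0, mu1, y0_hat, y1_hat))       # 1.0
```

The toy example shows why both metrics are reported: a model can match the average effect while still misestimating the effect for every individual unit.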
Abstract
Causal effect estimation aims at estimating the Average Treatment Effect as well as the Conditional Average Treatment Effect of a treatment on an outcome from the available data. This knowledge is important in many safety-critical domains, where it often needs to be extracted from observational data. In this work, we propose a new causal inference model, named C-XGBoost, for the prediction of potential outcomes. The motivation of our approach is to combine the superiority of tree-based models in handling tabular data with the notable ability of neural network-based causal inference models to learn representations that are useful for estimating the outcome in both the treatment and non-treatment cases. The proposed model also inherits the considerable advantages of the XGBoost model: it efficiently handles features with missing values with minimal preprocessing effort, and it is equipped with regularization techniques to mitigate overfitting and bias. Furthermore, we propose a new loss function for efficiently training the proposed causal inference model. The experimental analysis, which is based on the performance profiles of Dolan and Moré as well as on post-hoc and non-parametric statistical tests, provides strong evidence of the effectiveness of the proposed approach.
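The abstract does not spell out the loss function, so the following is only a sketch of one plausible form for a two-output model: a squared error applied to the control output for untreated samples and to the treated output for treated samples. This "factual" masking is an assumption, as are all names below. The gradient and Hessian with respect to the predictions are included because gradient boosting fits each tree to these quantities.

```python
import numpy as np

def factual_loss(y, t, y_hat):
    """Mean masked squared error over the observed (factual) outcomes.

    y     : observed outcomes, shape (n,)
    t     : binary treatment indicator (0 = control, 1 = treated), shape (n,)
    y_hat : predictions, shape (n, 2); column 0 = control, column 1 = treated
    """
    # mask selects, per sample, the output that matches the received treatment
    mask = np.stack([1.0 - t, t], axis=1)
    resid = y_hat - y[:, None]
    return float(np.mean(np.sum(mask * resid ** 2, axis=1)))

def factual_grad_hess(y, t, y_hat):
    """Per-sample gradient and Hessian diagonal of the (unaveraged) loss
    with respect to each of the two outputs, both of shape (n, 2)."""
    mask = np.stack([1.0 - t, t], axis=1)
    grad = 2.0 * mask * (y_hat - y[:, None])
    hess = 2.0 * mask  # zero for the unobserved (counterfactual) output
    return grad, hess

# Toy data: the first sample is a control unit, the second is treated.
y = np.array([1.0, 2.0])
t = np.array([0, 1])
y_hat = np.array([[0.0, 5.0],
                  [7.0, 4.0]])

loss = factual_loss(y, t, y_hat)            # (1 + 4) / 2 = 2.5
grad, hess = factual_grad_hess(y, t, y_hat)
print(loss)
print(grad)
print(hess)
```

Under this assumed form, each sample contributes a gradient only to the output that matches its observed treatment, so the counterfactual output is shaped entirely by the tree structure shared between the two outputs.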
