RieszBoost: Gradient Boosting for Riesz Regression
Kaitlyn J. Lee, Alejandro Schuler
TL;DR
RieszBoost addresses the problem of estimating the Riesz representer $\alpha_0(W)$, which efficient and doubly robust causal estimators require, by learning $\alpha_0(W)$ directly through gradient boosting on the Riesz loss $L(\alpha)$. It introduces a data-augmentation strategy that evaluates gradients at counterfactual points, which makes boosting possible on tabular data without an explicit analytical form for $\alpha_0$. Across simulations with binary and continuous treatments, RieszBoost matches or outperforms indirect, nuisance-based approaches in estimating both the Riesz representer and the target causal parameters (e.g., ATE, ATT, ASE, LASE), while maintaining favorable coverage and reducing reliance on density estimation. Its estimates plug directly into downstream estimators based on the efficient influence function (EIF) and are compatible with cross-fitting, offering a scalable, robust alternative for causal inference in high-dimensional tabular settings.
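To make the Riesz loss and the counterfactual evaluation points concrete, the following is a minimal sketch for the ATE only. It assumes the form commonly used in the Riesz regression literature, $L(\alpha) = E[\alpha(W)^2 - 2\,m(W, \alpha)]$ with $m(W, \alpha) = \alpha(1, X) - \alpha(0, X)$ and $W = (A, X)$; this summary does not spell that form out, so treat it as an assumption. The simulated design, the `propensity` function, and all other names are hypothetical. The sketch only evaluates the loss for fixed candidate functions; it does not implement the boosting algorithm itself.

```python
import numpy as np

def propensity(x):
    # Hypothetical propensity score, bounded in [0.2, 0.8] so weights stay finite.
    return 0.2 + 0.6 / (1.0 + np.exp(-x))

def alpha_true(a, x):
    # Known ATE representer for this simulated design: signed inverse propensity weights.
    p = propensity(x)
    return a / p - (1.0 - a) / (1.0 - p)

def riesz_loss_ate(alpha, a, x):
    # Augment each observation with its two counterfactual copies (A=1 and A=0).
    a1 = np.ones_like(x)
    a0 = np.zeros_like(x)
    squared_term = alpha(a, x) ** 2            # evaluated at the observed points
    linear_term = alpha(a1, x) - alpha(a0, x)  # m(W, alpha) at the counterfactual points
    return float(np.mean(squared_term - 2.0 * linear_term))

rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
a = rng.binomial(1, propensity(x)).astype(float)

# Compare the true representer with two deliberately wrong candidates.
loss_true = riesz_loss_ate(alpha_true, a, x)
loss_shrunk = riesz_loss_ate(lambda a_, x_: 0.5 * alpha_true(a_, x_), a, x)
loss_zero = riesz_loss_ate(lambda a_, x_: np.zeros_like(x_), a, x)
print(loss_true, loss_shrunk, loss_zero)
```

In this simulated design the true representer attains the lowest loss of the three candidates, and the loss is computed without ever evaluating a propensity score inside `riesz_loss_ate`. That is the property a direct learner can exploit: it only needs to evaluate candidate functions at observed and counterfactual rows.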
Abstract
Answering causal questions often involves estimating linear functionals of conditional expectations, such as the average treatment effect or the effect of a longitudinal modified treatment policy. By the Riesz representation theorem, these functionals can be expressed as the expected product of the conditional expectation of the outcome and the Riesz representer, a key component in doubly robust estimation methods. Traditionally, the Riesz representer is estimated indirectly by deriving its explicit analytical form, estimating its components, and substituting these estimates into the known form (e.g., the inverse propensity score). However, deriving or estimating the analytical form can be challenging, and substitution methods are often sensitive to practical positivity violations, leading to higher variance and wider confidence intervals. In this paper, we propose a novel gradient boosting algorithm to directly estimate the Riesz representer without requiring its explicit analytical form. This method is particularly suited for tabular data, offering a flexible, nonparametric, and computationally efficient alternative to existing methods for Riesz regression. Through simulation studies, we demonstrate that our algorithm performs on par with or better than indirect estimation techniques across a range of functionals, providing a user-friendly and robust solution for estimating causal quantities.
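As a worked illustration of the representation described above, consider the average treatment effect with a binary treatment. The notation here is an assumption introduced for the example: $W = (A, X)$, $\mu_0(a, x) = E[Y \mid A = a, X = x]$, and $\pi_0(x) = P(A = 1 \mid X = x)$.

$$
\begin{aligned}
\psi_0 &= E\big[\mu_0(1, X) - \mu_0(0, X)\big] = E\big[\alpha_0(W)\,\mu_0(A, X)\big], \\
\alpha_0(W) &= \frac{A}{\pi_0(X)} - \frac{1 - A}{1 - \pi_0(X)}, \\
E\!\left[\frac{A}{\pi_0(X)}\,\mu_0(A, X) \,\Big|\, X\right] &= \frac{\pi_0(X)}{\pi_0(X)}\,\mu_0(1, X) = \mu_0(1, X).
\end{aligned}
$$

The last line verifies the treated arm; the control arm follows in the same way. The indirect approach estimates $\pi_0$ and substitutes it into the second line, which is where small estimated propensities inflate the weights.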
