Combining T-learning and DR-learning: a framework for oracle-efficient estimation of causal contrasts

Lars van der Laan; Marco Carone; Alex Luedtke

TL;DR

The paper tackles estimating heterogeneous causal contrasts such as the conditional average treatment effect and conditional relative risk. It introduces EP-learning, an efficient plug-in risk framework that preserves the stability of plug-in methods while achieving the oracle efficiency of Neyman-orthogonal approaches. By constructing a refined outcome regression $\mu_n^*$ via sieve-based debiasing, EP-learning yields doubly robust estimators for CATE and CRR and provides rigorous nonparametric efficiency guarantees with cross-fitting and sieve methods. Empirical results show EP-learners outperform state-of-the-art competitors across low- and high-dimensional settings, suggesting substantial practical improvements for causal inference in observational and experimental data.

Abstract

We introduce efficient plug-in (EP) learning, a novel framework for the estimation of heterogeneous causal contrasts, such as the conditional average treatment effect and conditional relative risk. The EP-learning framework enjoys the same oracle-efficiency as Neyman-orthogonal learning strategies, such as DR-learning and R-learning, while addressing some of their primary drawbacks, including that (i) their practical applicability can be hindered by loss function non-convexity; and (ii) they may suffer from poor performance and instability due to inverse probability weighting and pseudo-outcomes that violate bounds. To avoid these drawbacks, EP-learner constructs an efficient plug-in estimator of the population risk function for the causal contrast, thereby inheriting the stability and robustness properties of plug-in estimation strategies like T-learning. Under reasonable conditions, EP-learners based on empirical risk minimization are oracle-efficient, exhibiting asymptotic equivalence to the minimizer of an oracle-efficient one-step debiased estimator of the population risk function. In simulation experiments, we illustrate that EP-learners of the conditional average treatment effect and conditional relative risk outperform state-of-the-art competitors, including T-learner, R-learner, and DR-learner. Open-source implementations of the proposed methods are available in our R package hte3.
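Drawback (ii) above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' EP-learner and not the hte3 API: the data-generating process, the function names, and the use of the true nuisance functions in place of estimates are all assumptions made for illustration. It contrasts the plug-in (T-learner style) contrast, which for a binary outcome always lies in [-1, 1], with the doubly robust pseudo-outcome, whose inverse probability weights can push individual values well outside that range.

```python
import numpy as np


def expit(x):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def simulate(n, seed=0):
    """Hypothetical data-generating process: binary outcome, uneven overlap."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1.0, 1.0, size=n)       # single covariate
    pi = expit(2.5 * w)                      # true propensity P(A = 1 | W)
    a = rng.binomial(1, pi)                  # treatment indicator
    mu0 = expit(-0.5 + w)                    # true E[Y | A = 0, W]
    mu1 = expit(0.5 + w)                     # true E[Y | A = 1, W]
    y = rng.binomial(1, np.where(a == 1, mu1, mu0))
    return w, a, y, pi, mu0, mu1


def plug_in_contrast(mu0, mu1):
    """Plug-in CATE: a difference of two probabilities, so bounded in [-1, 1]."""
    return mu1 - mu0


def dr_pseudo_outcome(a, y, pi, mu0, mu1):
    """Doubly robust pseudo-outcome: plug-in contrast plus a weighted residual."""
    mu_a = np.where(a == 1, mu1, mu0)
    weight = (a - pi) / (pi * (1.0 - pi))    # large when pi is near 0 or 1
    return mu1 - mu0 + weight * (y - mu_a)


# Assumption: true nuisances are used here; in practice they are estimated.
w, a, y, pi, mu0, mu1 = simulate(5000)
plug_in = plug_in_contrast(mu0, mu1)
pseudo = dr_pseudo_outcome(a, y, pi, mu0, mu1)

print("plug-in range:       ", plug_in.min(), plug_in.max())
print("pseudo-outcome range:", pseudo.min(), pseudo.max())
print("means (plug-in, DR): ", plug_in.mean(), pseudo.mean())
```

Both quantities target the same average effect, but regressing the pseudo-outcome on covariates exposes the learner to heavy-tailed responses. A plug-in risk avoids this because it uses only the bounded outcome regressions.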

Paper Structure (36 sections, 26 theorems, 200 equations, 8 figures, 3 algorithms)

Introduction
Background
Our contributions
Problem setup
Data structure and notation
Statistical goal
Our proposed approach: EP-learning
Limitations of existing Neyman-orthogonal learning strategies
Sensitivity of CATE DR-learner to large propensity weights
Nonconvexity of Neyman-orthogonal CRR loss function
Proposed approach: EP-learning algorithm
Theoretical guarantees
Efficiency of the EP risk estimator
Convergence rate guarantees for ERM-based EP-learners
Numerical experiments
...and 21 more sections

Key Result

Theorem 1

Suppose Condition cond::pos holds. Then, for an arbitrary element $\theta \in \overline{\mathcal{F}}$, the nonparametric efficient influence function of $P' \mapsto R_{P'}(\theta)$ at $P \in \mathcal{M}$ is given by $D_{P,\theta}$.
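As a hedged illustration of the map $P' \mapsto R_{P'}(\theta)$ in the CATE case, one natural choice of risk is the least-squares risk below. The notation is an assumption of this sketch rather than a quotation from the paper: $W$ denotes covariates, $A$ the binary treatment, and $\mu_P(a, w) = E_P[Y \mid A = a, W = w]$ the outcome regression.

$$R_P(\theta) = E_P\left[\left\{\theta(W) - \left(\mu_P(1, W) - \mu_P(0, W)\right)\right\}^2\right]$$

Under the same assumptions, a plug-in estimator of this risk replaces $\mu_P$ with an estimate and the expectation with a sample average. With the refined outcome regression $\mu_n^*$ mentioned in the TL;DR, it would read:

$$\widehat{R}_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left\{\theta(W_i) - \left(\mu_n^*(1, W_i) - \mu_n^*(0, W_i)\right)\right\}^2$$

Minimizing such a risk over $\theta$ needs no inverse probability weights in the loss itself, which is the stability property the abstract attributes to plug-in strategies like T-learning.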

Figures (8)

Figure 1: CATE estimates based on EP-learner, DR-learner, and T-learner with random forests and various maximum tree depths, computed on a single dataset. (Top) EP-learner, DR-learner, and T-learner CATE estimates with 10-fold cross-validated maximum tree depth; mean squared prediction errors are 0.0021, 0.012, and 0.005, respectively. The observed covariate distribution is depicted in gray. (Bottom) EP-learner, DR-learner, and T-learner CATE estimates for maximum tree depths of 1, 2, 4, and 7.
Figure 2: (Left) Common R packages for logistic regression may or may not allow negative weights and outcomes outside of [0,1]; a checkmark indicates that they do. ($\ast$) The xgboost package has many built-in loss functions, but they do not accept negative weights. However, these negative weights can be absorbed into a custom loss function. (Right) Example dataset for estimating the CRR using the one-step efficient risk estimator.
Figure 3: CATE experiments with a complex CATE and moderate treatment overlap: Mean squared error for DR-learner, R-learner, T-learner, and cross-validated EP-learner with supervised learning algorithm GAM, MARS, ranger, and xgboost. For the plots reporting the results of ranger, we also display the results of causal forests for comparison.
Figure 4: CRR experiments with a complex CRR and moderate (left) and limited (right) treatment overlap: Mean-squared error for IPW-learner, T-learner, and CV-EP-learner with supervised learning algorithm GAM, random forests, and xgboost. The DR-learner algorithm for the CRR is not implemented due to nonconvexity of the loss.
Figure 5: (Left) Empirical reverse cumulative distribution function (RCDF) of estimated DR pseudo-outcome values for the simulated dataset. (Right) A zoomed-in empirical RCDF. The 1% and 99% empirical quantiles of the pseudo-outcome are -3.2 and 3.1, while the minimum and maximum values are -16 and 14.
...and 3 more figures

Theorems & Definitions (53)

Example 1: Conditional average treatment effect
Example 2: Conditional relative risk
Theorem 1
Example 3 (continuation of Example 1): Conditional average treatment effect
Theorem 2: Oracle efficiency of EP-learner risk
Theorem 3: Oracle learner rate
Theorem 4: EP-learner convergence rate for a deterministic sieve growth rate
Theorem 5: Oracle efficiency of ERM-based EP-learner
Theorem 6
Proof: Proof of Theorem \ref{theorem:knn}
...and 43 more
