Inference for Heterogeneous Treatment Effects with Efficient Instruments and Machine Learning

Cyrill Scheidegger, Zijian Guo, Peter Bühlmann

TL;DR

The paper develops a novel instrumental-variable framework for inference on heterogeneous treatment effects under endogeneity, combining double/debiased machine learning with efficient instruments learned from data and kernel smoothing for univariate heterogeneity. It provides consistency and asymptotic normality results, along with robust weak-IV confidence sets, and extends the methodology to homogeneous effects with practical guidance. Through simulations and two real-data applications, the approach demonstrates accuracy and improved coverage under weak instruments when using robust inference, and clarifies situations where learning efficient instruments offers gains. The work delivers an accessible implementation in the R package IVDML, enabling practitioners to estimate and draw inference on β(v) while flexibly accommodating nuisance-function estimation via modern ML tools.

Abstract

We introduce a new instrumental variable (IV) estimator for heterogeneous treatment effects in the presence of endogeneity. Our estimator is based on double/debiased machine learning (DML) and uses efficient machine learning instruments (MLIV) and kernel smoothing. We prove consistency and asymptotic normality of our estimator and also construct confidence sets that are more robust towards weak IV. Along the way, we also provide an accessible discussion of the corresponding estimator for the homogeneous treatment effect with efficient machine learning instruments. The methods are evaluated on synthetic and real datasets and an implementation is made available in the R package IVDML.
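To make the double/debiased machine learning idea concrete, the following is a minimal sketch of a cross-fitted, residual-on-residual IV estimate for a homogeneous treatment effect, together with a standard 95% confidence interval. It is not the IVDML implementation. The polynomial nuisance learner, the function names (`poly_fit_predict`, `dml_iv`, `simulate`), and the simulated data are all assumptions made for illustration. The sketch uses the raw instrument `z`; the learned efficient instruments and the kernel smoothing described in the abstract are omitted.

```python
import numpy as np


def poly_fit_predict(x_train, y_train, x_test, degree=3):
    """Stand-in nuisance learner (assumption): polynomial least squares in a scalar covariate."""
    coefs = np.polyfit(x_train, y_train, degree)
    return np.polyval(coefs, x_test)


def dml_iv(y, d, z, x, n_folds=2, seed=0):
    """Cross-fitted residual-on-residual IV estimate and plug-in standard error."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    r_y, r_d, r_z = np.empty(n), np.empty(n), np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Fit each nuisance regression on the training folds, residualize on the held-out fold.
        for target, resid in ((y, r_y), (d, r_d), (z, r_z)):
            pred = poly_fit_predict(x[train], target[train], x[test_idx])
            resid[test_idx] = target[test_idx] - pred
    beta = np.sum(r_z * r_y) / np.sum(r_z * r_d)
    eps = r_y - beta * r_d
    # Sandwich-type standard error for the ratio estimator.
    se = np.sqrt(np.mean(r_z**2 * eps**2) / n) / abs(np.mean(r_z * r_d))
    return beta, se


def simulate(n=4000, beta=1.0, seed=1):
    """Hypothetical data with an unobserved confounder h that makes d endogenous."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    z = np.sin(x) + rng.normal(size=n)
    h = rng.normal(size=n)
    d = z + x**2 + h + rng.normal(size=n)
    y = beta * d + np.cos(x) - h + rng.normal(size=n)
    return y, d, z, x


y, d, z, x = simulate()
beta_hat, se = dml_iv(y, d, z, x)
ci = (beta_hat - 1.96 * se, beta_hat + 1.96 * se)
print(f"beta_hat={beta_hat:.3f}, se={se:.3f}, 95% CI=({ci[0]:.3f}, {ci[1]:.3f})")
```

When the instrument is weak, the denominator `np.sum(r_z * r_d)` is close to zero and this standard interval can be unreliable, which is the situation the robust confidence sets mentioned in the abstract are meant to address.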

Paper Structure

This paper contains 45 sections, 15 theorems, 112 equations, 11 figures, 2 tables.

Key Result

Proposition 4

Assume that $\mathbb{E}[\epsilon_i^2|Z_i, X_i]=\mathbb{E}[\epsilon_i^2]$ a.s. Then $\sigma^2\leq \sigma_\zeta^2$, with equality if and only if there exist $\alpha \neq 0$ and a function $\psi$ such that [...]. Here, $\sigma^2$ is the asymptotic variance of the proposed estimator $\hat{\beta}$; the paper defines the variance under the label eq_HomDefSigma and the estimator under the label eq_DefEstimator.

Figures (11)

  • Figure 1: Visualization for setting (het.)/(Z lin.) (top) and (het.)/(Z nonlin.) (bottom). On the left, the bandwidth $h$ is chosen according to the normal reference rule / Silverman's rule of thumb. On the right, the bandwidth is chosen according to the normal reference rule multiplied by $N^{1/5}/N^{2/7}$ to ensure undersmoothing. The black solid line is the true heterogeneous treatment effect. Solid lines are the point estimates for the methods (het. T.E. linearIV) (red) and (het. T.E. mlIV) (blue). Dashed lines are the (pointwise) standard 95%-confidence intervals. Shaded regions are the (pointwise) robust 95%-confidence sets. Red-shaded regions correspond to (het. T.E. linearIV) and blue-shaded regions correspond to (het. T.E. mlIV). Purple-shaded regions are the intersection of the two.
  • Figure 2: Simulation results for setting (hom.)/(Z lin.) (top) and (hom.)/(Z nonlin.) (bottom). Left panel: mean squared error (MSE) of $\hat{\beta}$ (hom. T.E.), $\hat{\beta}(0)$ (het. T.E. for $v = 0$) and $\hat{\beta}(1.5)$ (het. T.E. for $v = 1.5$). Middle panel: coverage of standard confidence intervals. Right panel: coverage of robust confidence sets. Methods based on (linearIV) are in red and methods based on (mlIV) are in blue. Solid lines correspond to the methods (hom. T.E. linearIV) and (hom. T.E. mlIV). Dashed lines correspond to the methods (het. T.E. linearIV) and (het. T.E. mlIV) for $v = 0$. Dotted lines correspond to the methods (het. T.E. linearIV) and (het. T.E. mlIV) for $v = 1.5$.
  • Figure 3: Simulation results for setting (het.)/(Z lin.) (top) and (het.)/(Z nonlin.) (bottom). Left panel: mean squared error (MSE) of $\hat{\beta}(0)$ (het. T.E. for $v = 0$) and $\hat{\beta}(1.5)$ (het. T.E. for $v = 1.5$). Middle panel: coverage of standard confidence intervals. Right panel: coverage of robust confidence sets. Methods based on (linearIV) are in red and methods based on (mlIV) are in blue. Dashed lines correspond to the methods (het. T.E. linearIV) and (het. T.E. mlIV) for $v = 0$. Dotted lines correspond to the methods (het. T.E. linearIV) and (het. T.E. mlIV) for $v = 1.5$.
  • Figure 4: Heterogeneous treatment effect for the AJR data. Top: nuisance functions are estimated using gam. Bottom: nuisance functions are estimated using xgboost. On the left, the bandwidth $h$ is chosen according to the normal reference rule / Silverman's rule of thumb. On the right, the bandwidth is chosen according to the normal reference rule multiplied by $N^{1/5}/N^{2/7}$ to ensure undersmoothing. Solid lines are the point estimates for the methods (het. T.E. linearIV) (red) and (het. T.E. mlIV) (blue). Dashed lines are the (pointwise) standard 95%-confidence intervals. Shaded regions are the (pointwise) robust 95%-confidence sets. For the smaller bandwidth, there is a region around 0.57 for which no results can be estimated because no data points are close enough. Red-shaded regions correspond to (het. T.E. linearIV) and blue-shaded regions correspond to (het. T.E. mlIV). Purple-shaded regions are the intersection of the two.
  • Figure 5: Heterogeneous treatment effect for the Card data. Top: nuisance functions are estimated using gam. Bottom: nuisance functions are estimated using xgboost. On the left, the bandwidth $h$ is chosen according to the normal reference rule / Silverman's rule of thumb. On the right, the bandwidth is chosen according to the normal reference rule multiplied by $N^{1/5}/N^{2/7}$ to ensure undersmoothing. Solid lines are the point estimates for the methods (het. T.E. linearIV) (red) and (het. T.E. mlIV) (blue). Dashed lines are the (pointwise) standard 95%-confidence intervals. Shaded regions are the (pointwise) robust 95%-confidence sets. Red-shaded regions correspond to (het. T.E. linearIV) and blue-shaded regions correspond to (het. T.E. mlIV). Purple-shaded regions are the intersection of the two.
  • ...and 6 more figures
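
The two bandwidth choices named in the captions of Figures 1, 4 and 5 can be illustrated with a short sketch. It assumes the normal reference rule takes the common form $h = 1.06\,\hat{\sigma} N^{-1/5}$; Silverman's variant instead uses the constant 0.9 together with the smaller of the standard deviation and IQR/1.34, and the captions do not say which constant the paper uses. Multiplying by $N^{1/5}/N^{2/7}$ changes the rate from $N^{-1/5}$ to $N^{-2/7}$. The function names and the sample `v` are hypothetical.

```python
import numpy as np


def normal_reference_bandwidth(v, constant=1.06):
    """Normal reference rule h = constant * sd(v) * N^(-1/5); the constant is an assumption."""
    v = np.asarray(v, dtype=float)
    n = v.size
    return constant * np.std(v, ddof=1) * n ** (-1 / 5)


def undersmoothed_bandwidth(v, constant=1.06):
    """Reference bandwidth multiplied by N^(1/5) / N^(2/7), so that h shrinks at rate N^(-2/7)."""
    n = np.asarray(v).size
    return normal_reference_bandwidth(v, constant) * n ** (1 / 5) / n ** (2 / 7)


# Hypothetical sample of the univariate heterogeneity variable.
v = np.random.default_rng(0).normal(size=1000)
h_ref = normal_reference_bandwidth(v)
h_under = undersmoothed_bandwidth(v)
print(f"reference h={h_ref:.4f}, undersmoothed h={h_under:.4f}")
```

The undersmoothed bandwidth is always the smaller of the two for $N > 1$, which trades extra variance for a smoothing bias that vanishes faster than the estimator's standard error.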

Theorems & Definitions (29)

  • Remark 1
  • Remark 2
  • Remark 3
  • Proposition 4
  • Remark 5
  • Remark 6
  • Remark 7
  • Theorem 8
  • Theorem 9
  • Remark 10
  • ...and 19 more