A Meta-learner for Heterogeneous Effects in Difference-in-Differences

Hui Lan, Haoge Chang, Eleanor Dillon, Vasilis Syrgkanis

TL;DR

This work tackles heterogeneous treatment effects in panel data under conditional parallel trends by developing a doubly robust meta-learner for the Conditional Average Treatment Effect on the Treated (CATT). It reframes estimation as a convex, Neyman-orthogonal loss problem that remains robust to nuisance-model errors and extends naturally to general conditional functionals under covariate shift, with a unifying approach for multi-period and IV-DiD settings. The method yields fast learning rates and practical model-aggregation capabilities, demonstrated through fully synthetic experiments and a real minimum wage case study where it uncovers interpretable heterogeneity patterns, such as the role of county population in modulating effects. Overall, the proposed DR-Learner framework supports interpretable policy guidance by delivering robust, data-driven estimates of how treatment effects vary across subpopulations in DiD contexts.

Abstract

We address the problem of estimating heterogeneous treatment effects in panel data, adopting the popular Difference-in-Differences (DiD) framework under the conditional parallel trends assumption. We propose a novel doubly robust meta-learner for the Conditional Average Treatment Effect on the Treated (CATT), reducing the estimation to a convex risk minimization problem involving a set of auxiliary models. Our framework allows for flexible estimation of the CATT, conditional on any subset of variables of interest, using generic machine learning. Leveraging Neyman orthogonality, our proposed approach is robust to estimation errors in the auxiliary models. As a generalization of our main result, we develop a meta-learning approach for the estimation of general conditional functionals under covariate shift. We also provide an extension to the instrumented DiD setting with non-compliance. Empirical results demonstrate the superiority of our approach over existing baselines.
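
To make the role of an auxiliary model concrete, here is a minimal plug-in sketch in Python. It is not the paper's doubly robust learner: it only illustrates the basic DiD adjustment of subtracting an estimated control-group outcome trend from each treated unit's outcome change, then averaging within covariate groups. The function name `plugin_catt`, the tuple layout `(d, w, y0, y1)`, the discrete covariate, and the toy numbers are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def plugin_catt(rows):
    """Plug-in CATT estimate for a two-period panel (illustrative only).

    rows: iterable of (d, w, y0, y1), where d is the treatment indicator
    (0 or 1), w is a discrete covariate, and y0, y1 are the pre- and
    post-period outcomes. The conditioning variable is taken to be w itself.
    """
    # Step 1 (auxiliary model): control-group trend per covariate value,
    # g_hat(w) = mean of (y1 - y0) among units with d == 0.
    control_trends = defaultdict(list)
    for d, w, y0, y1 in rows:
        if d == 0:
            control_trends[w].append(y1 - y0)
    g_hat = {w: mean(changes) for w, changes in control_trends.items()}

    # Step 2: among treated units, average the trend-adjusted outcome change.
    adjusted = defaultdict(list)
    for d, w, y0, y1 in rows:
        if d == 1 and w in g_hat:  # groups without controls are skipped
            adjusted[w].append((y1 - y0) - g_hat[w])
    return {w: mean(values) for w, values in adjusted.items()}


# Hypothetical toy panel: the control trend is 1 for "small" and 2 for "large".
rows = [
    (0, "small", 10.0, 11.0),
    (0, "small", 12.0, 13.0),
    (1, "small", 11.0, 15.0),
    (1, "small", 9.0, 13.0),
    (0, "large", 20.0, 22.0),
    (0, "large", 18.0, 20.0),
    (1, "large", 21.0, 24.0),
    (1, "large", 19.0, 22.0),
]
catt = plugin_catt(rows)
print(catt)
```

A plug-in estimate like this inherits any error in the estimated trend directly, which is the sensitivity that the Neyman-orthogonal, doubly robust construction described in the abstract is designed to reduce.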

Paper Structure

This paper contains 22 sections, 15 theorems, 120 equations, 4 figures, 7 tables.

Key Result

Proposition 2.3

Under the conditional parallel trends and no-anticipation assumptions (assum:cpta and assum:no-an), the CATT, $\theta_0(X)$, can be identified in terms of the control-group trend $g_0(w) := \mathbb{E}[Y_1(0) - Y_0(0) \mid D=0, W=w]$.
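
The displayed identification formula is not reproduced in the statement above. The following is a hedged reconstruction of how such an identification usually goes, not the paper's exact statement. It assumes the CATT is defined as $\theta_0(X) := \mathbb{E}[Y_1(1) - Y_1(0) \mid D=1, X]$, that $X$ is a subset (or function) of the covariates $W$, and that the two labelled assumptions are conditional parallel trends given $W$ and no anticipation.

```latex
\begin{align*}
\theta_0(X)
  &= \mathbb{E}[Y_1(1) - Y_0(0) \mid D=1, X]
     - \mathbb{E}[Y_1(0) - Y_0(0) \mid D=1, X] \\
  &= \mathbb{E}[Y_1 - Y_0 \mid D=1, X]
     - \mathbb{E}\big[\, \mathbb{E}[Y_1(0) - Y_0(0) \mid D=1, W] \;\big|\; D=1, X \big]
     && \text{(no anticipation, tower property)} \\
  &= \mathbb{E}[Y_1 - Y_0 \mid D=1, X]
     - \mathbb{E}[\, g_0(W) \mid D=1, X \,]
     && \text{(conditional parallel trends)} \\
  &= \mathbb{E}[\, Y_1 - Y_0 - g_0(W) \mid D=1, X \,].
\end{align*}
```

In the second line, no anticipation gives $Y_0 = Y_0(0)$ for treated units, while $Y_1 = Y_1(1)$ when $D=1$. In the third line, conditional parallel trends equates the untreated trend of the treated group, given $W$, with $g_0(W)$.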

Figures (4)

  • Figure 1: Predicted CATT with respect to log county population.
  • Figure 2: Calibration plot for CATT of minimum wage with respect to log county population.
  • Figure 3: Calibration plot for CATT w.r.t. log county population for the XGBoost doubly robust learner.
  • Figure 4: Calibration plot for CATT w.r.t. log county population for the linear doubly robust learner.

Theorems & Definitions (35)

  • Proposition 2.3
  • Definition 3.2: Conditional Neyman Orthogonality
  • Lemma 3.3: Doubly Robust CATT on Subspace of Covariates
  • Remark 3.4
  • Proposition 3.5
  • Theorem 3.6: CATT Rates
  • Example 4.2: Conditional prediction powered inference
  • Example 4.3: Heterogeneous long-term effects from short-term experiments using historical data
  • Example 4.4: CATE with covariate shift
  • Definition 4.5: Conditional Riesz Representer
  • ...and 25 more