Data-Driven Estimation of Heterogeneous Treatment Effects

Christopher Tran, Keith Burghardt, Kristina Lerman, Elena Zheleva

TL;DR

This work provides a survey of state-of-the-art data-driven methods for heterogeneous treatment effect estimation using machine learning, broadly categorizing them as methods that focus on counterfactual prediction and methods that directly estimate the causal effect.

Abstract

Estimating how a treatment affects different individuals, known as heterogeneous treatment effect estimation, is an important problem in empirical sciences. In the last few years, there has been considerable interest in adapting machine learning algorithms to the problem of estimating heterogeneous effects from observational and experimental data. However, these algorithms often make strong assumptions about the observed features in the data and ignore the structure of the underlying causal model, which can lead to biased estimation. At the same time, the underlying causal mechanism is rarely known in real-world datasets, making it hard to take into consideration. In this work, we provide a survey of state-of-the-art data-driven methods for heterogeneous treatment effect estimation using machine learning, broadly categorizing them as methods that focus on counterfactual prediction and methods that directly estimate the causal effect. We also provide an overview of a third category of methods that rely on structural causal models and learn the model structure from data. Our empirical evaluation under various underlying structural model mechanisms shows the advantages and deficiencies of existing estimators and of the metrics for measuring their performance.

Paper Structure (40 sections, 33 equations, 6 figures, 8 tables)

Figures (6)

  • Figure 1: A comparison of training and prediction of effects using single-model and two-model approaches. The single-model approach trains one estimator on the features together with the treatment indicator; for prediction, the features are appended with a treatment indicator of 1 and then of 0, giving two predictions from the same model, and the effect is estimated as their difference. The two-model approach trains two separate models, one on the treated group and one on the control group; for prediction, the features are input into both models, and the difference between their outputs is the predicted effect.
  • Figure 2: A high-level architecture of the Balancing Neural Network (BNN) from johansson-icml16. The first part learns a representation of the features only, $\Phi$. In the second part, the treatment is appended to the representation, and the neural network is optimized jointly for prediction performance and for minimizing the discrepancy between the representations of the treated and control groups.
  • Figure 3: An example causal model graph.
  • Figure 4: Causal model with multiple possible sources of interactions that lead to heterogeneous treatment effects.
  • Figure 5: Causal models where data-driven HTE estimation methods may perform poorly if the structure is unknown. Some variables (e.g., $L$, $M$, and $E$) are not valid adjustment variables for estimating the effect of $T$ on $Y$.
  • ...and 1 more figure
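
The single-model and two-model approaches described in the Figure 1 caption can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the base learner (ordinary least squares via NumPy), the helper names (`fit_ols`, `predict_ols`, `single_model_effect`, `two_model_effect`), and the simulated data are all assumptions made for the example.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept (assumed base learner)."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_ols(coef, X):
    """Predict with coefficients returned by fit_ols."""
    design = np.column_stack([np.ones(len(X)), X])
    return design @ coef

def single_model_effect(X, t, y, X_new):
    """Single-model approach: one estimator trained on [features, treatment]."""
    coef = fit_ols(np.column_stack([X, t]), y)
    # Predict twice with the same model: treatment set to 1, then to 0.
    y1 = predict_ols(coef, np.column_stack([X_new, np.ones(len(X_new))]))
    y0 = predict_ols(coef, np.column_stack([X_new, np.zeros(len(X_new))]))
    return y1 - y0

def two_model_effect(X, t, y, X_new):
    """Two-model approach: separate estimators for treated and control units."""
    treated = t == 1
    coef1 = fit_ols(X[treated], y[treated])
    coef0 = fit_ols(X[~treated], y[~treated])
    # Feed the same features to both models and take the difference.
    return predict_ols(coef1, X_new) - predict_ols(coef0, X_new)

# Hypothetical simulated experiment: randomized treatment, effect varies with x.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 1))
t = rng.integers(0, 2, size=n)
true_effect = 1.0 + 2.0 * X[:, 0]
y = X[:, 0] + t * true_effect + rng.normal(scale=0.1, size=n)

s_effect = single_model_effect(X, t, y, X)
t_effect = two_model_effect(X, t, y, X)
print("single-model effect range:", np.ptp(s_effect))
print("two-model mean abs error:", np.mean(np.abs(t_effect - true_effect)))
```

With a linear base learner and no interaction terms, the single-model estimate is the same for every unit, because the treatment indicator enters only as an additive term. A nonlinear base learner, or explicit feature-by-treatment interactions, would let the single-model estimate vary with the features. The two-model approach fits a separate slope for each group, so it can recover the varying effect in this simulated setting.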

Theorems & Definitions (1)

  • Definition 1