Estimating Individual Dose-Response Curves under Unobserved Confounders from Observational Data

Shutong Chen, Yang Li

TL;DR

ContiVAE is a novel framework for estimating the causal effects of continuous treatments, measured by individual dose-response curves, from observational data in the presence of unobserved confounders. It predicts the potential outcome at any treatment level for each individual while capturing the heterogeneity among individuals.

Abstract

Estimating an individual's potential response to continuously varied treatments is crucial for addressing causal questions across diverse domains, from healthcare to the social sciences. However, existing methods are limited either to binary treatments or to scenarios in which all confounding variables are measurable. In this work, we present ContiVAE, a novel framework for estimating the causal effects of continuous treatments, measured by individual dose-response curves, from observational data in the presence of unobserved confounders. Leveraging a variational auto-encoder with a Tilted Gaussian prior distribution, ContiVAE models the hidden confounders as latent variables and is able to predict the potential outcome at any treatment level for each individual while effectively capturing the heterogeneity among individuals. Experiments on semi-synthetic datasets show that ContiVAE outperforms existing methods by up to 62%, demonstrating its robustness and flexibility. An application to a real-world dataset illustrates its practical utility.

Paper Structure

This paper contains 28 sections, 1 theorem, 15 equations, 7 figures, 5 tables.

Key Result

Theorem 1

If $p(\mathbf{X}, \mathbf{Z}, T, Y)$ is recovered, $\mathbb{E}[Y(t)|\mathbf{X} = \mathbf{x}]$ is identifiable from observational data under the causal model in Figure 2.
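
To make the identifiability statement concrete, the following is a sketch of the standard back-door adjustment for latent-variable causal models, written in the notation of the theorem. It is an illustration under the assumption that $\mathbf{Z}$ blocks every back-door path from $T$ to $Y$ and that $\mathbf{X}$ is not a descendant of $T$. The paper's own proof may proceed differently.

$$
\mathbb{E}[Y(t) \mid \mathbf{X} = \mathbf{x}]
= \int \mathbb{E}[Y(t) \mid \mathbf{X} = \mathbf{x}, \mathbf{Z} = \mathbf{z}] \, p(\mathbf{z} \mid \mathbf{x}) \, d\mathbf{z}
= \int \mathbb{E}[Y \mid \mathbf{X} = \mathbf{x}, T = t, \mathbf{Z} = \mathbf{z}] \, p(\mathbf{z} \mid \mathbf{x}) \, d\mathbf{z}
$$

The first equality is the law of total expectation over $\mathbf{Z}$. The second replaces the potential outcome with the observed outcome once $\mathbf{Z}$ is conditioned on. Both factors in the final integral, $\mathbb{E}[Y \mid \mathbf{x}, t, \mathbf{z}]$ and $p(\mathbf{z} \mid \mathbf{x})$, can be computed from the joint distribution $p(\mathbf{X}, \mathbf{Z}, T, Y)$, which is why recovering that joint suffices.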

Figures (7)

  • Figure 1: An example of individual dose-response curves. Different colors denote curves of patients with various covariates $\mathbf{X}$ (e.g., age or gender).
  • Figure 2: The causal graph model. $T$ is the treatment, $Y$ is the outcome, $\mathbf{Z}$ denotes the unobserved confounders, and $\mathbf{X}$ denotes the measurable covariates affected by $\mathbf{Z}$.
  • Figure 3: The model architecture of ContiVAE. Blue rectangles represent MLP networks with $h$ units per hidden layer, while green rectangles denote output layers. Circles represent probability distributions.
  • Figure 4: An example, from Floto et al. (2022), of the Tilted Gaussian distribution in 2D with $\tau = 3$. The region of maximum density lies on the surface of radius 3.
  • Figure 5: Performance of our model on Curve 4 (News dataset) under different settings of $\lambda$ in the loss function, the number of units in each hidden layer, and the dimension of the latent variable $\mathbf{Z}$. The blue line and left y-axis show $\sqrt{\text{MISE}}$; the green line and right y-axis show $\sqrt{\text{DPE}}$.
  • ...and 2 more figures
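
The Figure 4 caption states that, for $\tau = 3$, the Tilted Gaussian places its maximum density at radius 3. The short check below illustrates why. It assumes the unnormalized density $p_\tau(\mathbf{z}) \propto \exp(\tau \lVert \mathbf{z} \rVert) \exp(-\lVert \mathbf{z} \rVert^2 / 2)$, which is the form commonly given for this prior. The function name and the grid are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def tilted_gaussian_log_density(z, tau):
    """Unnormalized log-density of the (assumed) Tilted Gaussian.

    log p(z) = tau * ||z|| - ||z||^2 / 2 + const, evaluated row-wise.
    """
    z = np.asarray(z, dtype=float)
    norm = np.linalg.norm(z, axis=-1)
    return tau * norm - 0.5 * norm ** 2


tau = 3.0

# Evaluate the log-density at 2D points along the positive x-axis.
radii = np.linspace(0.0, 6.0, 601)
points = np.stack([radii, np.zeros_like(radii)], axis=-1)
log_p = tilted_gaussian_log_density(points, tau)

# The radius with the highest density should coincide with tau.
best_radius = radii[np.argmax(log_p)]
print(f"radius of maximum density: {best_radius:.2f}")
```

Because the log-density depends on $\mathbf{z}$ only through its norm, the same maximum is attained in every direction, which gives the ring-shaped region of high density described in the caption.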

Theorems & Definitions (2)

  • Theorem 1
  • Proof