Disentangled Representation Learning for Causal Inference with Instruments

Debo Cheng, Jiuyong Li, Lin Liu, Ziqi Xu, Weijia Zhang, Jixue Liu, Thuc Duy Le

TL;DR

A variational autoencoder (VAE)-based disentangled representation learning method that learns an IV representation from a dataset with latent confounders and then uses that representation to obtain an unbiased estimate of the causal effect.

Abstract

Latent confounders are a fundamental challenge for inferring causal effects from observational data. The instrumental variable (IV) approach is a practical way to address this challenge. Existing IV-based estimators need a known IV or other strong assumptions, such as the existence of two or more IVs in the system, which limits the application of the IV approach. In this paper, we consider a relaxed requirement, which assumes there is an IV proxy in the system without knowing which variable is the proxy. We propose a Variational AutoEncoder (VAE)-based disentangled representation learning method to learn an IV representation from a dataset with latent confounders and then utilise the IV representation to obtain an unbiased estimate of the causal effect from the data. Extensive experiments on synthetic and real-world data demonstrate that the proposed algorithm outperforms the existing IV-based estimators and VAE-based estimators.
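To make the role of an instrument concrete, the following is a minimal simulation sketch, not the paper's method. It assumes a known, directly observed IV `z` and a linear data-generating process with hypothetical coefficients, and it compares a naive regression of the outcome on the treatment with the standard single-instrument ratio (Wald) estimator. The function names and all numeric settings are illustrative assumptions.

```python
import numpy as np

def simulate(n=20000, beta=2.0, seed=0):
    """Linear toy model (assumed): u confounds w and y, z affects y only via w."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                        # latent confounder (never observed)
    z = rng.normal(size=n)                        # instrument, independent of u
    w = 1.0 * z + 1.0 * u + rng.normal(size=n)    # treatment
    y = beta * w + 1.5 * u + rng.normal(size=n)   # outcome; beta is the true effect
    return z, w, y

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    x_c, y_c = x - x.mean(), y - y.mean()
    return float(x_c @ y_c / (x_c @ x_c))

def iv_estimate(z, w, y):
    """Single-instrument ratio estimator: cov(z, y) / cov(z, w)."""
    z_c = z - z.mean()
    return float(z_c @ (y - y.mean()) / (z_c @ (w - w.mean())))

z, w, y = simulate()
naive = ols_slope(w, y)      # absorbs the confounding path through u
iv = iv_estimate(z, w, y)    # uses only the variation in w induced by z
print(f"naive OLS: {naive:.3f}, IV: {iv:.3f}, true effect: 2.0")
```

In this toy setting the naive slope is pulled away from the true effect by the latent confounder, while the ratio estimator stays close to it. The setting in the abstract is harder: the IV is not known in advance, which is why an IV representation has to be learned first.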

Paper Structure

This paper contains 22 sections, 1 theorem, 8 equations, 13 figures, 5 tables.

Key Result

Theorem 1

Given a joint distribution $P(\mathbf{X}, W, Y)$ generated from a causal DAG $\mathcal{G}=(\mathbf{X}\cup \mathbf{U}\cup \{W, Y\}, \mathbf{E})$, where $\mathcal{G}$ contains $W\rightarrow Y$ and $W\leftarrow U'\rightarrow Y$, and $\forall X\in\mathbf{X}$, $X\notin De(W\cup Y)$ in

Figures (13)

  • Figure 1: Causal graphs with latent variables to show the problem of causal effect estimation from observational data. In DAG (a), the causal effect of $W$ on $Y$ cannot be estimated from observational data; in DAG (b), there is a valid IV $Z$; in DAG (c), $Z$ is an unmeasured IV and $S$ is a surrogate IV (SIV) of $Z$. The causal effect of $W$ on $Y$ in both DAGs (b) and (c) can be recovered from observational data.
  • Figure 2: An example causal DAG representing the data generation mechanism. The shaded area indicates all the measured pretreatment variables, and among them, $\mathbf{S}$ is a set of SIVs, $\mathbf{Z}$ is a set of latent IVs and $U$ is a latent confounder affecting both $W$ and $Y$.
  • Figure 3: The disentanglement scheme of DIV.VAE, represented as a causal graph. The dotted arrows indicate possible ancestral relationships between nodes. $W$, $Y$ and $U'$ are the treatment variable, the outcome and the latent confounder of $W$ and $Y$, respectively. $\mathbf{X}$ is the set of measured pretreatment variables and contains at least one SIV. $\mathbf{\Phi}=(\mathbf{Z}, \mathbf{C})$ is the latent representation of $\mathbf{X}$, where $\mathbf{Z}$ and $\mathbf{C}$ are the sets of disentangled IV representation and confounding representation, respectively.
  • Figure 4: DIV.VAE architecture. The input $\mathbf{X}$ is encoded by $q_{\phi_{\mathbf{Z}}}(\mathbf{Z}\mid\mathbf{X})$ and $q_{\phi_{\mathbf{C}}}(\mathbf{C}\mid\mathbf{X})$ into the parameters of the latent representation. The middle dashed box is the orthogonality promoting regularisation (OPR) for ensuring $\mathbf{Z}\perp\!\!\!\perp\mathbf{C}$. Samples are drawn from each of the latent representations using the reparameterisation trick. The samples are then concatenated and decoded through $p_{\theta_{\mathbf{x}}}(\mathbf{X}\mid\mathbf{Z}, \mathbf{C})$. The two grey boxes indicate the two auxiliary predictors $q_{\varphi_{W}}(W\mid\mathbf{Z},\mathbf{C})$ and $q_{\varphi_{Y}}(Y\mid W, \mathbf{C})$.
  • Figure 5: The true causal DAG used to generate the synthetic datasets, with a latent confounder $U$ between $W$ and $Y$. $Z$ and $S$ are a latent IV and an SIV, respectively. $\{U_1, U_2\}$ are two latent variables, and the other measured variables are pretreatment variables of $(W, Y)$.
  • ...and 8 more figures
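
The data flow described in the Figure 4 caption can be sketched as a single forward pass. This is a hedged illustration only: the layers are single linear maps with random, untrained weights, all dimensions are hypothetical, the treatment is taken to be continuous, and the orthogonality penalty shown (squared Frobenius norm of the cross-covariance between $\mathbf{Z}$ and $\mathbf{C}$) is an assumed stand-in, since the exact form of the OPR is not given here. No training loop or loss weighting is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Random, untrained weights standing in for a learned layer."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim)), np.zeros(out_dim)

def encode(x, params):
    """Return mean and log-variance of a diagonal Gaussian posterior."""
    (w_mu, b_mu), (w_lv, b_lv) = params
    return x @ w_mu + b_mu, x @ w_lv + b_lv

def reparameterise(mu, log_var):
    """Reparameterisation trick: mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def orthogonality_penalty(z, c):
    """Assumed OPR stand-in: squared Frobenius norm of cov(Z, C)."""
    z_c = z - z.mean(axis=0)
    c_c = c - c.mean(axis=0)
    cross_cov = z_c.T @ c_c / (len(z) - 1)
    return float(np.sum(cross_cov ** 2))

# Hypothetical sizes: 256 samples, 10 covariates, 2-d Z and 3-d C.
n, x_dim, z_dim, c_dim = 256, 10, 2, 3
x = rng.normal(size=(n, x_dim))   # measured pretreatment variables X
w = rng.normal(size=(n, 1))       # treatment W (continuous here by assumption)

enc_z = (linear(x_dim, z_dim), linear(x_dim, z_dim))   # q(Z | X)
enc_c = (linear(x_dim, c_dim), linear(x_dim, c_dim))   # q(C | X)
dec_x = linear(z_dim + c_dim, x_dim)                   # p(X | Z, C)
pred_w = linear(z_dim + c_dim, 1)                      # q(W | Z, C)
pred_y = linear(1 + c_dim, 1)                          # q(Y | W, C)

z = reparameterise(*encode(x, enc_z))
c = reparameterise(*encode(x, enc_c))
zc = np.concatenate([z, c], axis=1)                    # concatenated samples

x_hat = zc @ dec_x[0] + dec_x[1]                       # reconstruction of X
w_hat = zc @ pred_w[0] + pred_w[1]                     # auxiliary prediction of W
y_hat = np.concatenate([w, c], axis=1) @ pred_y[0] + pred_y[1]  # auxiliary prediction of Y
opr = orthogonality_penalty(z, c)                      # zero when Z and C are uncorrelated
```

Note that the outcome predictor takes only `w` and `c` as input, mirroring $q_{\varphi_{Y}}(Y\mid W, \mathbf{C})$ in the caption, so the IV representation can influence the predicted outcome only through the treatment.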

Theorems & Definitions (7)

  • Definition 1: Markov property (Pearl, 2009)
  • Definition 2: Faithfulness (Spirtes et al., 2000)
  • Definition 3: d-separation (Pearl, 2009)
  • Definition 4: Back-door criterion (Pearl, 2009)
  • Definition 5: Surrogate Instrumental Variable (SIV)
  • Theorem 1
  • proof