Causal Dynamic Variational Autoencoder for Counterfactual Regression in Longitudinal Data

Mouad El Bouchattaoui; Myriam Tami; Benoit Lepetit; Paul-Henry Cournède

TL;DR

The paper tackles causal effect estimation in longitudinal settings with unobserved static adjustment variables by introducing CDVAE, a causal dynamic variational autoencoder that learns latent substitutes to augment ACATE estimation under sequential ignorability. It establishes identifiability of the augmented treatment effect using a finite-order conditional Markov model and derives a generalization bound linking representation learning, covariate balance, and model loss. The near-deterministic analysis shows that as the decoder variance vanishes, causal estimates become realization-invariant and consistent. Empirical evaluations on synthetic and semi-synthetic (MIMIC-III) data demonstrate that CDVAE outperforms baselines and that augmenting existing models with CDVAE substitutes yields near-oracle performance, highlighting the practical impact for personalized decision-making in medicine and beyond.

Abstract

Accurately estimating treatment effects over time is crucial in fields such as precision medicine, epidemiology, economics, and marketing. Many current methods for estimating treatment effects over time assume that all confounders are observed or attempt to infer unobserved ones. In contrast, our approach focuses on unobserved adjustment variables, which specifically have a causal effect on the outcome sequence. Under the assumption of unconfoundedness, we address estimating Conditional Average Treatment Effects (CATEs) while accounting for unobserved heterogeneity in response to treatment due to these unobserved adjustment variables. Our proposed Causal Dynamic Variational Autoencoder (CDVAE) is grounded in theoretical guarantees concerning the validity of latent adjustment variables and generalization bounds on CATE estimation error. Extensive evaluations on synthetic and real-world datasets show that CDVAE outperforms existing baselines. Moreover, we demonstrate that state-of-the-art models significantly improve their CATE estimates when augmented with the latent substitutes learned by CDVAE, approaching oracle-level performance without direct access to the true adjustment variables.
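
The CATE estimation error mentioned above is reported in the figure captions as PEHE (precision in estimating heterogeneous effects). The following is a minimal sketch of that metric, assuming the common unweighted root-mean-squared form; the paper's exact variant (for example, the weighted PEHE named in Theorem 11) may differ. The function name `pehe` and all toy values are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def pehe(tau_true: Sequence[float], tau_hat: Sequence[float]) -> float:
    """Root-mean-squared error between true and estimated treatment effects.

    Assumption: unweighted root-MSE form of PEHE, not necessarily the
    exact variant used in the paper.
    """
    if len(tau_true) != len(tau_hat) or len(tau_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - h) ** 2 for t, h in zip(tau_true, tau_hat)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy values: the true effect differs across units (heterogeneity).
tau_true = [1.0, 2.0, 3.0, 4.0]
# An estimator that ignores heterogeneity predicts the average effect everywhere.
tau_hat_constant = [2.5, 2.5, 2.5, 2.5]
# An estimator that recovers most of the unit-level variation.
tau_hat_augmented = [1.1, 1.9, 3.2, 3.8]

print(pehe(tau_true, tau_hat_constant))
print(pehe(tau_true, tau_hat_augmented))
```

In this toy setting, the estimator that tracks unit-level variation attains a lower PEHE than the constant one, which is the sense in which smaller values are better.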

Paper Structure (82 sections, 14 theorems, 152 equations, 11 figures, 22 tables, 3 algorithms)

Introduction
Assumptions over Confounders and Existing Approaches
Our Focus
Our Approach
Contributions
Related Work
Causal Inference in Time-Varying Settings
Combining Weighting and Representation Learning
Probabilistic Modeling in Causal Inference
Problem Definition
Inference Problem
Causal DVAE
When does a Latent Representation Act as a Valid Substitute for Unobserved Adjustment Variables?
Definition of the Probabilistic Model
Step 1: Handling Selection Bias
...and 67 more sections

Key Result

Theorem 4

Let $\mathbf{Z}$ be a latent variable satisfying CMM($p$). Assume the response domain $\mathcal{Y}$ is a Borel subset of a compact interval. Then sequential ignorability holds when the history process $\mathbf{H}_{t}$, which represents the history up to time $t$, is augmented with $\mathbf{Z}$.

Figures (11)

Figure 1: A simplified representation of the DGP at time $t$. Edges between $Y_{<t}$, $W_{<t}$, and $\mathbf{X}_{<t}$ are omitted for simplicity.
Figure 2: A simplified causal graph for the sketch of the proof of Theorem \ref{thm:valid_Z}. We do not represent $W_{\leq t}, \mathbf{X}_{\leq t}$ for simplicity.
Figure 3: Evolution of PEHE in estimating ACATE for synthetic data across increasing levels of heterogeneity induced by adjustment variables $\mathbf{U}$.
Figure 4: Results on the MIMIC-III data reported by PEHE and organized following the three possible configurations. Smaller is better.
Figure 5: Evolution of the variance parameter during training for synthetic data (left), for each level of $\gamma_{(1)}^{YU}$, and for MIMIC-III (right), averaged over 10 random initializations.
...and 6 more figures

Theorems & Definitions (16)

Remark 1
Theorem 4: Sequential Ignorability with Augmented History
Corollary 5: Identifiability of ACATE with $\mathbf{Z}$
Theorem 6
Theorem 7: Weighted ELBO Decomposition
Theorem 8: Asymptotic Likelihood Recovery
Theorem 9: Realization-Invariant Causal Consistency
Remark 10
Theorem 11: Generalization Bound for Weighted PEHE
Proposition 12: ELBO-Risk Connection
...and 6 more
