Modeling Causal Mechanisms with Diffusion Models for Interventional and Counterfactual Queries

Patrick Chao; Patrick Blöbaum; Sapan Patel; Shiva Prasad Kasiviswanathan

Abstract

We consider the problem of answering observational, interventional, and counterfactual queries in a causally sufficient setting where only observational data and the causal graph are available. Building on recent developments in diffusion models, we introduce diffusion-based causal models (DCM) to learn causal mechanisms that generate unique latent encodings. These encodings enable us to sample directly under interventions and to perform abduction for counterfactuals. Diffusion models are a natural fit here, since they can encode each node to a latent representation that acts as a proxy for exogenous noise. Our empirical evaluations demonstrate significant improvements over existing state-of-the-art methods for answering causal queries. Furthermore, we provide theoretical results that offer a methodology for analyzing counterfactual estimation in general encoder-decoder models, which could be useful in settings beyond our proposed approach.

Paper Structure (22 sections, 12 theorems, 52 equations, 8 figures, 8 tables, 7 algorithms)

Introduction
Preliminaries
DCMs: Diffusion-based Causal Models
Bounding Counterfactual Error
Extension of Theorem \ref{thm: cf theory} to Higher-Dimensional Setting
Experimental Evaluation
Synthetic Data Experiments
Real Data Experiments I
Concluding Remarks
Missing Details from Section \ref{sec: theory}
Testing Independence between Parents and Encodings
Missing Experimental Details
Model Hyperparameters
Details about the Additive Noise Model (ANM)
Details about Random Graph Generation
...and 7 more sections

Key Result

Theorem 1

Assume for $X \in \mathcal{X} \subset \mathbb{R}$ and exogenous noise $U \sim \mathrm{Unif}[0,1]$, $X$ satisfies the structural equation $X := f(X_{\mathrm{pa}}, U)$, where $X_{\mathrm{pa}} \in \mathcal{X}_{\mathrm{pa}} \subset \mathbb{R}^d$ are the parents of node $X$ and $U \perp\!\!\!\perp X_{\mathrm{pa}}$. Then, $g(X, X_{\mathrm{pa}}) = \tilde{q}(U)$ for an invertible function $\tilde{q}$.
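
As a rough illustration of why an encoding of the form $\tilde{q}(U)$ is useful, the sketch below runs the abduction, action, prediction steps for a single node. It is not the paper's diffusion model: the structural equation `f`, the encoder `g`, and the decoder `h` are hand-written toy functions chosen for this example, and reading $g$ as an encoder with a matching decoder $h$ is an assumption based on the abstract. Here `g` recovers $\tilde{q}(U) = 2U - 1$ exactly, so decoding the same latent under new parent values should reproduce the true counterfactual.

```python
import math
import random


def f(x_pa, u):
    """Toy structural equation (assumed for illustration); strictly increasing in u."""
    return x_pa + math.exp(x_pa) * u


def g(x, x_pa):
    """Toy encoder (assumed): invertible in x, and equal to q(u) = 2u - 1."""
    u = (x - x_pa) / math.exp(x_pa)
    return 2.0 * u - 1.0


def h(z, x_pa):
    """Toy decoder (assumed): inverts g in its first argument for fixed parents."""
    u = (z + 1.0) / 2.0
    return x_pa + math.exp(x_pa) * u


def counterfactual(x, x_pa, x_pa_new):
    """Abduction (encode the factual sample), then decode under the new parent value."""
    z = g(x, x_pa)
    return h(z, x_pa_new)


rng = random.Random(0)
max_err = 0.0
for _ in range(1000):
    x_pa = rng.gauss(0.0, 1.0)      # factual parent value
    u = rng.random()                # exogenous noise, Unif[0, 1]
    x = f(x_pa, u)                  # factual observation
    x_pa_new = rng.gauss(0.0, 1.0)  # intervened parent value
    estimate = counterfactual(x, x_pa, x_pa_new)
    truth = f(x_pa_new, u)          # ground truth uses the same noise u
    max_err = max(max_err, abs(estimate - truth))

print("largest counterfactual error:", max_err)
```

In this toy case the error is only floating-point round-off, because the encoder is an exact invertible function of the noise. With a learned encoder the match would be approximate.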

Figures (8)

Figure 1: Left: Nonlinear setting (NLIN), Right: Nonadditive setting (NADD). Box plots of observational, interventional, and counterfactual queries of the ladder and random SCMs over $20$ random initializations of the model and training data.
Figure 2: Top: Nonlinear setting (NLIN), Bottom: Nonadditive setting (NADD). Box plots of observational, interventional, and counterfactual queries of four different SCMs over $10$ random initializations of the model and training data.
Figure 3: Ladder graph used in Section \ref{sec: exp}.
Figure 4: Example of a random graph used in Section \ref{sec: exp}, with exogenous noise nodes omitted for clarity.
Figure 5: Chain graph.
...and 3 more figures

Theorems & Definitions (18)

Theorem 1
Corollary 1
Corollary 2
Theorem 2
Lemma 1
proof
Theorem 2
proof
Corollary 2
proof
...and 8 more
