CausalGAN: Learning Causal Implicit Generative Models with Adversarial Training

Murat Kocaoglu, Christopher Snyder, Alexandros G. Dimakis, Sriram Vishwanath

TL;DR

This work introduces causal implicit generative models (CiGMs) and two graph-aware GAN architectures, CausalGAN and CausalBEGAN, for sampling from both observational and interventional image distributions given a causal graph over binary labels. The generators are structured to reflect the causal relationships and are trained with auxiliary networks (Labeler and Anti-Labeler); the authors show that the optimal CausalGAN generator samples from the image distribution conditioned on the given labels, and they extend this guarantee to multi-label settings. Training has two stages: a causal controller first learns the label distribution, and a conditional GAN then learns to generate images from those labels. This allows interventional sampling (the do-operator), which can produce samples that do not occur in the dataset, as demonstrated on CelebA. The generated images are reported to be high-quality and label-consistent under both conditioning and intervention, which illustrates the value of causal structure for controllable and diverse sample generation.

Abstract

We propose an adversarial training procedure for learning a causal implicit generative model for a given causal graph. We show that adversarial training can be used to learn a generative model with true observational and interventional distributions if the generator architecture is consistent with the given causal graph. We consider the application of generating faces based on given binary labels where the dependency structure between the labels is preserved with a causal graph. This problem can be seen as learning a causal implicit generative model for the image and labels. We devise a two-stage procedure for this problem. First we train a causal implicit generative model over binary labels using a neural network consistent with a causal graph as the generator. We empirically show that WassersteinGAN can be used to output discrete labels. Later, we propose two new conditional GAN architectures, which we call CausalGAN and CausalBEGAN. We show that the optimal generator of the CausalGAN, given the labels, samples from the image distributions conditioned on these labels. The conditional GAN combined with a trained causal implicit generative model for the labels is then a causal implicit generative model over the labels and the generated image. We show that the proposed architectures can be used to sample from observational and interventional image distributions, even for interventions which do not naturally occur in the dataset.
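
The difference between conditioning on a label and intervening on it can be illustrated with a toy sketch over two binary labels. This is not the paper's causal controller (which is a neural network trained adversarially): the graph Male -> Mustache is simply sampled by hand, and the probabilities `P_MALE` and `P_MUSTACHE_GIVEN_MALE`, as well as the function names, are hypothetical values chosen only for illustration.

```python
import random

# Hypothetical probabilities, chosen for illustration only (not from the paper).
P_MALE = 0.5                               # assumed P(Male = 1)
P_MUSTACHE_GIVEN_MALE = {1: 0.3, 0: 0.0}   # assumed P(Mustache = 1 | Male)


def sample_labels(rng, do_mustache=None):
    """Draw (male, mustache) from the toy graph Male -> Mustache.

    If do_mustache is given, the Mustache mechanism is replaced by that
    constant, which is what do(Mustache = value) means. Male is untouched.
    """
    male = int(rng.random() < P_MALE)
    if do_mustache is None:
        mustache = int(rng.random() < P_MUSTACHE_GIVEN_MALE[male])
    else:
        mustache = do_mustache
    return male, mustache


def fraction_male(samples):
    """Empirical estimate of P(Male = 1) over a list of (male, mustache) pairs."""
    return sum(male for male, _ in samples) / len(samples)


rng = random.Random(0)
n = 20000
observational = [sample_labels(rng) for _ in range(n)]
# Conditioning: keep only the observational samples that have Mustache = 1.
conditioned = [s for s in observational if s[1] == 1]
# Intervening: force Mustache = 1 in every sample.
intervened = [sample_labels(rng, do_mustache=1) for _ in range(n)]

print("P(Male=1 | Mustache=1)     ~", fraction_male(conditioned))
print("P(Male=1 | do(Mustache=1)) ~", fraction_male(intervened))
```

Under these assumed probabilities, conditioning on Mustache = 1 selects only male samples, while the intervention leaves the distribution of Male unchanged, so roughly half of the intervened samples are female with a mustache. Label combinations of this kind are the ones that do not occur naturally in the dataset.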

Paper Structure

This paper contains 33 sections, 13 theorems, 33 equations, 24 figures, 2 tables.

Key Result

Theorem 1

Let $G(l,z)$ be the output of the generator for a given label $l$ and latent vector $z$. Let $G^*$ be the global optimal generator for the loss function in (eq:gen_loss), when the rest of the network is trained to optimality. Then the generator samples from the conditional image distribution given t

Figures (24)

  • Figure 1: Observational and interventional image samples from CausalBEGAN. Our architecture can be used to sample not only from the joint distribution (conditioned on a label) but also from the interventional distribution, e.g., under the intervention $do(Mustache=1)$. The resulting distributions are clearly different, as is evident from the samples outside the dataset, e.g., females with mustaches.
  • Figure 2: (a) The causal graph implied by the standard generator architecture, a feedforward neural network. (b) A neural network implementation of the causal graph $X\rightarrow Z \leftarrow Y$: each feedforward neural net captures the function $f$ in the structural equation model $V = f(Pa_V, E)$.
  • Figure 3: A plausible causal model for image generation.
  • Figure 4: CausalGAN architecture.
  • Figure 5: The causal graph used for simulations for both CausalGAN and CausalBEGAN, called Causal Graph 1 (G1). We also add edges (see the appendix) to form the complete graph "cG1". We also make use of the graph rcG1, which is obtained by reversing the direction of every edge in cG1.
  • ...and 19 more figures
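
The structural equation model in the Figure 2 caption, $V = f(Pa_V, E)$ on the graph $X\rightarrow Z \leftarrow Y$, can be sketched as follows. In the paper each $f$ is a feedforward neural network; here the hand-written functions `f_x`, `f_y` and `f_z`, and the `sample` helper, are hypothetical stand-ins chosen only to show how each node is computed from its parents and its own exogenous noise.

```python
import random


def f_x(e_x):
    """X has no parents: it depends only on its exogenous noise E_X."""
    return int(e_x < 0.5)


def f_y(e_y):
    """Y has no parents: it depends only on its exogenous noise E_Y."""
    return int(e_y < 0.5)


def f_z(x, y, e_z):
    """Z = f(Pa_Z, E_Z) with Pa_Z = {X, Y}.

    Assumed mechanism for illustration: Z is the OR of its parents,
    flipped when the noise falls below 0.1.
    """
    flip = int(e_z < 0.1)
    return (x | y) ^ flip


def sample(rng, do=None):
    """Sample all nodes in topological order (X, Y, then Z).

    `do` maps node names to forced values; a forced node ignores its
    structural equation, and the other equations are left unchanged.
    """
    do = do or {}
    x = do["X"] if "X" in do else f_x(rng.random())
    y = do["Y"] if "Y" in do else f_y(rng.random())
    z = do["Z"] if "Z" in do else f_z(x, y, rng.random())
    return {"X": x, "Y": y, "Z": z}


rng = random.Random(1)
print(sample(rng))               # one observational sample
print(sample(rng, do={"Z": 1}))  # one sample under do(Z = 1)
```

Because Z is a child of X and Y, forcing Z with `do={"Z": 1}` leaves the distributions of X and Y as they were, whereas forcing X or Y would change the distribution of Z through `f_z`.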

Theorems & Definitions (24)

  • Theorem 1: Informal
  • Corollary 1
  • Proposition 1
  • proof
  • Definition 1
  • Proposition 2: Goodfellow2014
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • ...and 14 more