Counterfactuals uncover the modular structure of deep generative models

Michel Besserve, Arash Mehrjou, Rémy Sun, Bernhard Schölkopf

TL;DR

The paper introduces a causal framework based on causal generative models (CGMs) to study modularity in deep generative models, moving beyond statistical disentanglement by applying counterfactual interventions to internal representations. It defines intrinsic disentanglement and modularity, then uses counterfactual hybridization and influence maps to detect and quantify disentangled groups of internal variables (modules). In experiments on CelebA with DCGAN, β-VAE, and BEGAN, and on ImageNet with BigGAN, the authors identify interpretable modules (e.g., background, hair, facial features) and show that targeted interventions produce coherent counterfactuals while preserving image realism. The work enables controllable, efficient transformations (e.g., style transfer) and provides a diagnostic tool for assessing the robustness of recognition systems to contextual changes.

Abstract

Deep generative models can emulate the perceptual properties of complex image datasets, providing a latent representation of the data. However, manipulating such a representation to perform meaningful and controllable transformations in the data space remains challenging without some form of supervision. While previous work has focused on exploiting statistical independence to disentangle latent factors, we argue that such a requirement is too restrictive and propose instead a non-statistical framework that relies on counterfactual manipulations to uncover a modular structure of the network composed of disentangled groups of internal variables. Experiments with a variety of generative models trained on complex image datasets show the obtained modules can be used to design targeted interventions. This opens the way to applications such as computationally efficient style transfer and the automated assessment of robustness to contextual changes in pattern recognition systems.
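
The counterfactual hybridization and influence-map idea summarized above can be illustrated with a toy example. The sketch below is not the authors' implementation: the two-stage generator (`first_stage`, `second_stage`), its sizes, and the helper names `hybridize` and `influence_map` are all hypothetical, and the influence map is assumed here to be the mean absolute pixel change caused by swapping a module's activations between two samples. The toy generator is built so that each intermediate channel paints its own band of rows, which makes the expected modular structure easy to check.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes; a real generator would be a trained deep network.
N_CHANNELS, HEIGHT, WIDTH, LATENT_DIM = 4, 8, 8, 16
W_IN = rng.normal(size=(N_CHANNELS, LATENT_DIM))  # toy first-stage weights


def first_stage(z):
    """Map a latent vector to intermediate channel activations, shape (N_CHANNELS,)."""
    return np.tanh(W_IN @ z)


def second_stage(v):
    """Map channel activations to an image; channel c paints its own band of rows."""
    image = np.zeros((HEIGHT, WIDTH))
    rows = HEIGHT // N_CHANNELS
    for c, value in enumerate(v):
        image[c * rows:(c + 1) * rows, :] = value
    return image


def hybridize(z_original, z_donor, module):
    """Counterfactual hybrid: keep the original sample's activations except for
    the channels in `module`, which are taken from the donor sample."""
    v = first_stage(z_original)
    v[module] = first_stage(z_donor)[module]
    return second_stage(v)


def influence_map(module, n_pairs=200):
    """Assumed definition: mean absolute pixel difference between each original
    image and its hybrid, averaged over random (original, donor) pairs."""
    total = np.zeros((HEIGHT, WIDTH))
    for _ in range(n_pairs):
        z1, z2 = rng.normal(size=(2, LATENT_DIM))
        original = second_stage(first_stage(z1))
        total += np.abs(hybridize(z1, z2, module) - original)
    return total / n_pairs


# Influence of the module made of channel 1 only.
eim = influence_map(module=[1])
```

In this toy setting, `eim` is nonzero only in the rows painted by channel 1, which is the kind of localized effect that would mark a group of channels as a module. With a trained network, the same swap would be applied to the channels of an intermediate layer, and influence maps of different channels could then be compared or clustered.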

Paper Structure

This paper contains 13 sections, 5 theorems, 17 equations, 18 figures, 2 tables.

Key Result

Proposition 1

Consider an intervention on a subset $\mathcal{E}$. Its associated counterfactual mapping is faithful if and only if it is disentangled. For interventions on variables that remain within the support of the original marginal distribution, it is sufficient that $\mathcal{E}$ and its complement …

Figures (18)

  • Figure 1: Counterfactual manipulation of samples of a BigGAN trained on ImageNet (see the BigGAN section).
  • Figure 2: (a) Illustration of the generative mapping and a disentangled transformation. (b) Causal graph of an example CGM showing different types of independence between nodes. (c) Commutative diagram showing sparse transformation $T'$ in latent space associated to a disentangled transformation $T$. (d) Illustration of intrinsic disentanglement with $\mathcal{E}=\{2\}$.
  • Figure 3: Generation of influence maps. (a) Principle of sample hybridization through counterfactuals. (b) Left: Clustering of influence maps for a BEGAN trained on the CelebA dataset (see text). Center: example EIM of each cluster. Right: samples of the hybridization procedure, using as module all channels of the intermediate layer that belong to the cluster of the corresponding row. See the additional BEGAN examples figure for more samples.
  • Figure 3: Classification outcomes of several discriminative models for three randomly chosen koala+teddy hybrids (see the classifier inputs figure). The experiment investigates whether the proposed intervention procedure can be used to assess the robustness of classifiers. The resulting hybrids are roughly a teddy bear in a koala context. An ideal classifier should be sensitive to the object present in the scene, not to the contextual information: a teddy bear should still be classified as a teddy bear even if it appears on a tree, which is the environment in most of the koala images in the ImageNet dataset. The table shows that nasnet_large is more robust to the change of context than the other classifiers.
  • Figure 4: Examples of BigGAN hybridizations across classes. Left: ostrich-cock, right: koala-teddy. See the additional BigGAN example figures for more samples and the BigGAN entropy figures for entropy analysis.
  • ...and 13 more figures

Theorems & Definitions (15)

  • Definition 1: Unit level counterfactual, informal
  • Definition 2: Counterfactual mapping
  • Definition 3: Intrinsic disentanglement, informal
  • Proposition 1: Counterfactuals and disentanglement, informal
  • Definition 4: Modularity
  • Proposition 2: Modularity implies disentanglement
  • Proposition 3
  • Definition 5: Causal Generative Model (CGM)
  • Definition 6: Embedded CGMs
  • Proposition 4
  • ...and 5 more