Counterfactuals uncover the modular structure of deep generative models

Michel Besserve; Arash Mehrjou; Rémy Sun; Bernhard Schölkopf

TL;DR

The paper introduces a causal framework, built on causal generative models (CGMs), for studying modularity in deep generative models. It moves beyond statistical disentanglement by applying counterfactual interventions to internal representations. It defines intrinsic disentanglement and modularity, then uses counterfactual hybridization and influence maps to detect and quantify modules, i.e., disentangled groups of internal variables. In experiments on CelebA with DCGAN, β-VAE, and BEGAN, and on ImageNet with BigGAN, the authors identify interpretable modules (e.g., background, hair, facial features) and show that targeted interventions produce coherent counterfactuals while preserving image realism. The resulting modules support controllable, computationally efficient transformations (e.g., style transfer) and provide a diagnostic tool for assessing the robustness of recognition systems to contextual changes.

Abstract

Deep generative models can emulate the perceptual properties of complex image datasets, providing a latent representation of the data. However, manipulating such a representation to perform meaningful and controllable transformations in the data space remains challenging without some form of supervision. While previous work has focused on exploiting statistical independence to disentangle latent factors, we argue that such a requirement is too restrictive and propose instead a non-statistical framework that relies on counterfactual manipulations to uncover a modular structure of the network composed of disentangled groups of internal variables. Experiments with a variety of generative models trained on complex image datasets show that the obtained modules can be used to design targeted interventions. This opens the way to applications such as computationally efficient style transfer and the automated assessment of robustness to contextual changes in pattern recognition systems.
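
To make the idea of counterfactual hybridization and influence maps more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The toy two-layer generator, the block-structured weights, and the names hidden, decode, hybridize, and influence_map are all assumptions made for illustration. The influence map is taken here to be the mean absolute change in the output caused by hybridization, which is one plausible reading of the summary above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy generator: latent z -> hidden channels h -> "image" x.
LATENT, HIDDEN, PIXELS = 4, 6, 8
W1 = rng.normal(size=(HIDDEN, LATENT))

# Assumed modular structure: channels 0-2 drive pixels 0-3,
# channels 3-5 drive pixels 4-7. A real network has no such guarantee.
W2 = np.zeros((PIXELS, HIDDEN))
W2[:4, :3] = rng.normal(size=(4, 3))
W2[4:, 3:] = rng.normal(size=(4, 3))


def hidden(z):
    """Internal representation (ReLU of a linear map of the latent)."""
    return np.maximum(W1 @ z, 0.0)


def decode(h):
    """Map the internal representation to the output space."""
    return W2 @ h


def hybridize(z_orig, z_ref, channels):
    """Counterfactual hybrid: keep the internal state of z_orig, but
    overwrite the chosen channels with the values they take for z_ref."""
    idx = np.asarray(channels, dtype=int)
    h = hidden(z_orig).copy()
    h[idx] = hidden(z_ref)[idx]
    return decode(h)


def influence_map(channels, n_pairs=200):
    """Mean absolute output change per pixel caused by hybridizing
    the given channels, averaged over random latent pairs."""
    total = np.zeros(PIXELS)
    for _ in range(n_pairs):
        z1 = rng.normal(size=LATENT)
        z2 = rng.normal(size=LATENT)
        total += np.abs(hybridize(z1, z2, channels) - decode(hidden(z1)))
    return total / n_pairs
```

In this toy setup, influence_map([0, 1, 2]) is nonzero only on the first four pixels, so the group of channels behaves like a module tied to one region of the output. In a trained generator the channel groups are not known in advance and would have to be discovered, for example by grouping channels whose influence maps are similar.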
