Identifiability Guarantees for Causal Disentanglement from Purely Observational Data

Ryan Welch, Jiaqi Zhang, Caroline Uhler

TL;DR

This paper addresses the identifiability of latent causal factors from purely observational data under a nonlinear additive Gaussian noise model with linear mixing. It shows that the latent variables are identifiable only up to their upstream layers, and the exogenous noises only up to layer-wise transformations. It develops a constructive, score-based approach that reduces to a sequence of quadratically constrained quadratic programs (QCQPs) to recover layer-wise latent representations, followed by layer-wise nonlinear regression to recover the noise factors. The theoretical results are complemented by simulations: score-oracle experiments verify the layer-wise identifiability, while experiments with estimated scores demonstrate robust performance with finite samples. The work provides a principled framework for obtaining meaningful causal representations from observational data, with implications for hierarchical latent structure discovery and potential extensions to additional data modalities or interventions.

Abstract

Causal disentanglement aims to learn about latent causal factors behind data, holding the promise to augment existing representation learning methods in terms of interpretability and extrapolation. Recent advances establish identifiability results assuming that interventions on (single) latent factors are available; however, it remains debatable whether such assumptions are reasonable due to the inherent nature of intervening on latent variables. Accordingly, we reconsider the fundamentals and ask what can be learned using just observational data. We provide a precise characterization of latent factors that can be identified in nonlinear causal models with additive Gaussian noise and linear mixing, without any interventions or graphical restrictions. In particular, we show that the causal variables can be identified up to a layer-wise transformation and that further disentanglement is not possible. We transform these theoretical results into a practical algorithm consisting of solving a quadratic program over the score estimation of the observed data. We provide simulation results to support our theoretical guarantees and demonstrate that our algorithm can derive meaningful causal representations from purely observational data.
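The setting described above can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: a three-node line graph, `tanh` and `sin` as the nonlinear mechanisms, unit-variance noises, and a random Gaussian mixing matrix `G`. The function name `sample_line_graph` is hypothetical.

```python
import numpy as np

def sample_line_graph(n_samples, n_obs=5, seed=0):
    """Toy instance: latent line graph Z1 -> Z2 -> Z3, observed as X = G Z.

    The mechanisms (tanh, sin) and the mixing G are illustrative choices,
    not the ones used in the paper.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((n_samples, 3))   # exogenous Gaussian noises
    z = np.empty_like(eps)
    z[:, 0] = eps[:, 0]                         # root node: pure noise
    z[:, 1] = np.tanh(z[:, 0]) + eps[:, 1]      # nonlinear mechanism + additive noise
    z[:, 2] = np.sin(z[:, 1]) + eps[:, 2]
    G = rng.standard_normal((n_obs, 3))         # linear mixing, unknown to the learner
    x = z @ G.T                                 # observations, one row per sample
    return x, z, eps, G

x, z, eps, G = sample_line_graph(1000)
```

Only `x` would be available to a learner; `z`, `eps`, and `G` are returned so that estimates can be compared against the ground truth.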

Paper Structure

This paper contains 27 sections, 7 theorems, 36 equations, 4 figures, 2 tables, and 2 algorithms.

Key Result

Lemma 1

Similar results have been used in [varici2024general, varici2024score], where [varici2024score] provided formulas for general mixings. Under Assumption 1, the score functions and associated Jacobian matrices over $X$ and $Z$ are related via the following transformations:
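The lemma's formulas are not reproduced here, so the sketch below only checks the textbook change-of-variables identity for an invertible linear map $x = Gz$ with $H = G^{-1}$: $s_X(x) = H^\top s_Z(Hx)$ and $J_X(x) = H^\top J_Z(Hx) H$. This may differ from the lemma in notation or generality. The Gaussian latent distribution, the square mixing matrix, and the symbols `G`, `H`, and `Sigma` are assumptions made so that both scores have closed forms.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3

# Assumed toy setup: Z ~ N(0, Sigma) and a square, well-conditioned mixing G.
A = rng.standard_normal((d, d))
Sigma = A @ A.T + d * np.eye(d)                  # positive-definite covariance
G = rng.standard_normal((d, d)) + 3 * np.eye(d)
H = np.linalg.inv(G)

def score_z(z):
    """Score of N(0, Sigma): gradient of the log-density w.r.t. z."""
    return -np.linalg.solve(Sigma, z)

def score_x(x):
    """Score of X = G Z, which is N(0, G Sigma G^T)."""
    return -np.linalg.solve(G @ Sigma @ G.T, x)

z = rng.standard_normal(d)
x = G @ z

# Score relation: s_X(x) = H^T s_Z(H x)
score_gap = np.max(np.abs(score_x(x) - H.T @ score_z(z)))

# Jacobian relation: J_X = H^T J_Z H (both constant in the Gaussian case)
jac_z = -np.linalg.inv(Sigma)
jac_x = -np.linalg.inv(G @ Sigma @ G.T)
jac_gap = np.max(np.abs(jac_x - H.T @ jac_z @ H))
```

Both gaps should be at the level of floating-point round-off. For non-Gaussian latents the same relations hold pointwise, but the scores no longer have closed forms and would have to be estimated.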

Figures (4)

  • Figure 1: The considered data-generating process. The latent variables $Z$ follow a nonlinear causal model with additive Gaussian noises. We observe them after an unknown linear mixing (gray edges).
  • Figure 2: Mean absolute correlation of the exogenous noise estimates using score estimations.
  • Figure 3: Score oracle simulations. (A) Estimated versus true latent variables on the line graph. (B) Estimated versus true latent variables on the Y-structure. (C) Estimated versus true exogenous variables on the line graph. (D) Estimated versus true exogenous variables on the Y-structure.
  • Figure 4: Mean absolute correlation (MAC) of the estimates of $\mathcal{E}$ vs. signal-to-error ratio (SER) of the Jacobian matrices.
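
The captions of Figures 2 and 4 report a mean absolute correlation without defining it. One plausible reading, sketched below, averages the absolute Pearson correlation between each estimated component and its true counterpart. It assumes the columns are already matched one-to-one; the paper's exact definition (for example, how components are matched within a layer) may differ, and the function name `mean_abs_correlation` is hypothetical.

```python
import numpy as np

def mean_abs_correlation(est, true):
    """Average |Pearson correlation| between matched columns.

    est, true: arrays of shape (n_samples, d); column j of `est` is assumed
    to be the estimate of column j of `true`.
    """
    est_c = est - est.mean(axis=0)
    true_c = true - true.mean(axis=0)
    num = (est_c * true_c).sum(axis=0)
    den = np.sqrt((est_c ** 2).sum(axis=0) * (true_c ** 2).sum(axis=0))
    return float(np.mean(np.abs(num / den)))

rng = np.random.default_rng(0)
true_noise = rng.standard_normal((500, 3))
rescaled = -2.0 * true_noise + 1.0               # sign flip, scaling, and shift
unrelated = rng.standard_normal((500, 3))

mac_rescaled = mean_abs_correlation(rescaled, true_noise)
mac_unrelated = mean_abs_correlation(unrelated, true_noise)
```

Because the absolute value is taken, the metric is invariant to per-component sign flips, scaling, and shifts, so `mac_rescaled` is 1 up to round-off, while `mac_unrelated` is close to 0.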

Theorems & Definitions (15)

  • Definition 1: Identifiability up to upstream layers
  • Definition 2: Identifiability up to layers
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Theorem 1
  • Theorem 2
  • Proposition 1
  • Proof
  • Lemma 4
  • ...and 5 more