Identifiability Guarantees for Causal Disentanglement from Purely Observational Data
Ryan Welch, Jiaqi Zhang, Caroline Uhler
TL;DR
This paper studies the identifiability of latent causal factors from purely observational data under a nonlinear additive Gaussian noise model with linear mixing. It shows that latent variables are identifiable only up to their upstream layers, and exogenous noises only up to layer-wise transformations. It develops a constructive, score-based approach that reduces to a sequence of quadratically constrained quadratic programs (QCQPs) to recover layer-wise latent representations, followed by layer-wise nonlinear regression to recover the noise factors. Simulations complement the theory: score-oracle experiments verify the layer-wise identifiability, while experiments with estimated scores demonstrate robust performance with finite samples. The work provides a principled framework for obtaining meaningful causal representations from observational data, with implications for hierarchical latent structure discovery and potential extensions to additional data modalities or interventions.
Abstract
Causal disentanglement aims to learn the latent causal factors behind data, promising to augment existing representation learning methods in terms of interpretability and extrapolation. Recent advances establish identifiability results assuming that interventions on (single) latent factors are available; however, it remains debatable whether such assumptions are reasonable, given the inherent difficulty of intervening on latent variables. Accordingly, we reconsider the fundamentals and ask what can be learned using just observational data. We provide a precise characterization of the latent factors that can be identified in nonlinear causal models with additive Gaussian noise and linear mixing, without any interventions or graphical restrictions. In particular, we show that the causal variables can be identified up to a layer-wise transformation and that further disentanglement is not possible. We transform these theoretical results into a practical algorithm that solves a quadratic program over the estimated score of the observed data. We provide simulation results to support our theoretical guarantees and demonstrate that our algorithm can derive meaningful causal representations from purely observational data.
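To make the setting concrete, the following is a minimal sketch of a data-generating process of the kind the abstract describes: latent variables arranged in layers, each generated by a nonlinear function of upstream variables plus additive Gaussian noise, then mixed linearly into observations. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `simulate_latent_scm`, the layer sizes, the `tanh` mechanism, the random weights, and the choice to let each layer depend on all upstream layers. It only simulates data; it does not implement the paper's score-based algorithm.

```python
import numpy as np


def simulate_latent_scm(n_samples=1000, layer_sizes=(2, 2), obs_dim=6, seed=0):
    """Simulate observations X = Z G^T from a layered latent causal model.

    Hypothetical illustration: the mechanism (tanh of a random linear map)
    and the layer structure are assumptions, not the paper's construction.
    """
    rng = np.random.default_rng(seed)
    n_latent = sum(layer_sizes)

    # Latent variables generated so far (all upstream layers), initially none.
    upstream = np.empty((n_samples, 0))
    for size in layer_sizes:
        # Exogenous Gaussian noise for the variables in this layer.
        noise = rng.normal(size=(n_samples, size))
        if upstream.shape[1] == 0:
            # First layer has no parents: variables equal their noise.
            layer = noise
        else:
            # Nonlinear function of upstream variables plus additive noise.
            weights = rng.normal(size=(upstream.shape[1], size))
            layer = np.tanh(upstream @ weights) + noise
        upstream = np.hstack([upstream, layer])

    latents = upstream
    # Linear mixing; a random Gaussian matrix with obs_dim >= n_latent
    # has full column rank almost surely, so the mixing is injective.
    mixing = rng.normal(size=(obs_dim, n_latent))
    observations = latents @ mixing.T
    return observations, latents, mixing


X, Z, G = simulate_latent_scm()
print(X.shape, Z.shape, G.shape)
```

In this sketch only `X` would be available to a learner; `Z` and `G` are returned so that a recovered representation can be compared against the ground truth in simulation.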
