Differentiable Causal Discovery For Latent Hierarchical Causal Models
Parjanya Prashant, Ignavier Ng, Kun Zhang, Biwei Huang
TL;DR
This work tackles causal discovery when latent confounders form nonlinear, hierarchical structures. It introduces a differentiable framework that learns both the latent graph and the data-generating process by matching the observed distribution with a variational autoencoder while enforcing a structured, block upper-triangular latent graph via Gumbel-softmax. A key theoretical contribution is identifiability of nonlinear latent hierarchical models under relaxed conditions, formalized through a rank Jacobian criterion that links the observed distribution to latent d-separation, plus supporting lemmas and a measurement theorem. Empirically, the method shows improved accuracy and scalability over existing approaches on synthetic graphs and real-world image data (MNIST, CMNIST, CelebA), yielding interpretable hierarchical latent representations that transfer well across domains. The results suggest practical impact for learning interpretable, high-dimensional causal structures in vision and related domains, with avenues for further relaxation of assumptions and broader applicability.
Abstract
Discovering causal structures with latent variables from observational data is a fundamental challenge in causal discovery. Existing methods often rely on constraint-based, iterative discrete searches, limiting their scalability to large numbers of variables. Moreover, these methods frequently assume linearity or invertibility, restricting their applicability to real-world scenarios. We present new theoretical results on the identifiability of nonlinear latent hierarchical causal models, relaxing previous assumptions in the literature about the deterministic nature of latent variables and exogenous noise. Building on these insights, we develop a novel differentiable causal discovery algorithm that efficiently estimates the structure of such models. To the best of our knowledge, this is the first work to propose a differentiable causal discovery method for nonlinear latent hierarchical models. Our approach outperforms existing methods in both accuracy and scalability. We demonstrate its practical utility by learning interpretable hierarchical latent structures from high-dimensional image data and show its effectiveness on downstream tasks.
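The TL;DR mentions enforcing a block upper-triangular latent graph via Gumbel-softmax. As a rough illustration of that idea only, the sketch below samples a relaxed (soft) adjacency matrix in which an edge i -> j is allowed only when variable i sits in an earlier layer than variable j. It is not the authors' implementation: the function names, the `layer_of` encoding, the temperature value, and the choice to forbid within-layer edges are all assumptions made for this example, and it uses the two-class (sigmoid) form of the Gumbel-softmax relaxation in plain Python rather than a deep learning framework.

```python
import math
import random


def gumbel_sigmoid(logit, temperature, rng):
    """Relaxed Bernoulli sample (the two-class case of Gumbel-softmax)."""
    # The difference of two Gumbel(0, 1) draws is Logistic(0, 1) noise,
    # which can be sampled as log(u) - log(1 - u) with u ~ Uniform(0, 1).
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    noise = math.log(u) - math.log(1.0 - u)
    return 1.0 / (1.0 + math.exp(-(logit + noise) / temperature))


def sample_masked_adjacency(logits, layer_of, temperature=0.5, seed=0):
    """Sample soft edge weights, keeping only edges from earlier to later layers.

    logits[i][j] is a (hypothetical) learnable score for the edge i -> j.
    layer_of[i] is the hierarchy level of variable i (assumed known here).
    When variables are ordered by layer, the result is block strictly
    upper-triangular: within-layer and backward edges stay exactly zero.
    """
    rng = random.Random(seed)
    n = len(logits)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if layer_of[i] < layer_of[j]:  # structural mask
                adjacency[i][j] = gumbel_sigmoid(logits[i][j], temperature, rng)
    return adjacency


if __name__ == "__main__":
    # Five variables in three layers: {0, 1}, {2, 3}, {4}.
    layers = [0, 0, 1, 1, 2]
    scores = [[0.0] * 5 for _ in range(5)]
    for row in sample_masked_adjacency(scores, layers):
        print(" ".join(f"{value:.2f}" for value in row))
```

In a gradient-based setting the logits would be trainable parameters and the soft samples would feed a likelihood or variational objective; lowering the temperature pushes the samples toward hard 0/1 edges.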
