A Sparsity Principle for Partially Observable Causal Representation Learning
Danru Xu, Dingling Yao, Sébastien Lachapelle, Perouz Taslakian, Julius von Kügelgen, Francesco Locatello, Sara Magliacane
TL;DR
This work tackles causal representation learning (CRL) when each observation is informative about only a subset of the latent variables, introducing an Unpaired Partial Observation framework. It proves identifiability under a sparsity principle for both linear and piecewise-linear mixing. For linear mixing, the latents are recovered up to permutation and diagonal scaling under a zero-reconstruction-error constraint; for piecewise-linear mixing, the same identifiability holds given Gaussian latents and group-aware Gaussianity constraints. The paper then implements two estimation methods based on these results, replacing the $\ell_0$ sparsity term with an $\ell_1$ penalty and adding Gaussianity regularizers, and validates them on numerical simulations and image-based benchmarks (e.g., Multiple Balls and PartialCausal3DIdent). The results show that the latents are recovered reliably under varying partial observability patterns, suggesting practical potential for interpretable CRL in settings with occlusions or missing data. Limitations include reliance on known groupings and on Gaussianity in the nonlinear setting, motivating future work that extends identifiability to broader nonlinear regimes and weaker observability assumptions.
Abstract
Causal representation learning aims at identifying high-level causal variables from perceptual data. Most methods assume that all latent causal variables are captured in the high-dimensional observations. We instead consider a partially observed setting, in which each measurement only provides information about a subset of the underlying causal state. Prior work has studied this setting with multiple domains or views, each depending on a fixed subset of latents. Here, we focus on learning from unpaired observations from a dataset with an instance-dependent partial observability pattern. Our main contribution is to establish two identifiability results for this setting: one for linear mixing functions without parametric assumptions on the underlying causal model, and one for piecewise linear mixing functions with Gaussian latent causal variables. Based on these insights, we propose two methods for estimating the underlying causal variables by enforcing sparsity in the inferred representation. Experiments on different simulated datasets and established benchmarks highlight the effectiveness of our approach in recovering the ground-truth latents.
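To make the sparsity intuition concrete, the following is a minimal NumPy sketch of the linear case, not the paper's method or proof. It assumes that an unobserved latent is represented by zeroing it out before mixing, and all names (`mask`, `A`, `W_true`, `Q`, `mean_l0`, `mean_l1`) are hypothetical. It builds two linear encoders that both admit perfect reconstruction and compares how sparse their representations are: the encoder that inverts the mixing keeps masked latents at zero, while an entangled (rotated) one spreads each observed latent across all coordinates.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_z, d_x = 2000, 3, 5

# Latent values and an instance-dependent binary mask (1 = latent observed).
# Assumption for this sketch: unobserved latents are set to zero.
z = rng.normal(size=(n, d_z))
mask = rng.integers(0, 2, size=(n, d_z))
z_masked = z * mask

# Linear mixing x = A z with a random (almost surely injective) matrix A.
A = rng.normal(size=(d_x, d_z))
x = z_masked @ A.T

# Encoder 1: pseudo-inverse of A, which recovers the masked latents.
W_true = np.linalg.pinv(A)
z_hat_true = x @ W_true.T

# Encoder 2: the same encoder followed by a random orthogonal matrix Q.
# Paired with the decoder A @ Q.T it still reconstructs x exactly.
Q, _ = np.linalg.qr(rng.normal(size=(d_z, d_z)))
z_hat_mixed = z_hat_true @ Q.T


def mean_l0(z_hat, tol=1e-8):
    """Average number of non-zero coordinates per sample."""
    return float((np.abs(z_hat) > tol).sum(axis=1).mean())


def mean_l1(z_hat):
    """Average l1 norm per sample (a differentiable surrogate for l0)."""
    return float(np.abs(z_hat).sum(axis=1).mean())


l0_true, l0_mixed = mean_l0(z_hat_true), mean_l0(z_hat_mixed)
print("mean l0:", l0_true, "(disentangled) vs", l0_mixed, "(entangled)")
print("mean l1:", mean_l1(z_hat_true), "vs", mean_l1(z_hat_mixed))
```

Both encoders fit the data equally well, so reconstruction alone cannot tell them apart; in this toy setup the sparsity of the inferred representation is what separates the disentangled solution from the entangled one.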
