On the Identifiability of Nonlinear ICA: Sparsity and Beyond
Yujia Zheng, Ignavier Ng, Kun Zhang
TL;DR
This work tackles the identifiability challenge in nonlinear ICA without auxiliary variables by imposing a structural sparsity constraint on the mixing function’s Jacobian. It proves that, when the mixing function is smooth and invertible and its Jacobian satisfies the required sparsity pattern, the latent sources can be recovered up to a permutation and a component-wise invertible transformation, and it further shows how rotation indeterminacy can be mitigated within this framework. The authors provide a regularized estimation approach and validate the theory with experiments on synthetic and image data; the image results suggest that the conditions may hold in practical data-generating processes. Taken together, the results extend identifiable nonlinear ICA to unconditional priors and undercomplete settings, offering a principled alternative to weak supervision strategies.
Abstract
Nonlinear independent component analysis (ICA) aims to recover the underlying independent latent sources from their observable nonlinear mixtures. How to make the nonlinear ICA model identifiable up to certain trivial indeterminacies is a long-standing problem in unsupervised learning. Recent breakthroughs reformulate the standard independence assumption of sources as conditional independence given some auxiliary variables (e.g., class labels and/or domain/time indices) as weak supervision or inductive bias. However, nonlinear ICA with unconditional priors cannot benefit from such developments. We explore an alternative path and consider only assumptions on the mixing process, such as Structural Sparsity. We show that under specific instantiations of such constraints, the independent latent sources can be identified from their nonlinear mixtures up to a permutation and a component-wise transformation, thus achieving nontrivial identifiability of nonlinear ICA without auxiliary variables. We provide estimation methods and validate the theoretical results experimentally. The results on image data suggest that our conditions may hold in a number of practical data-generating processes.
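
The identifiability statement above can be written out as a short formal sketch. The notation here (x, s, f, the estimate with a hat, the permutation π, the scalar maps h_i, and the dimensions n and m) is assumed for illustration and does not appear in the abstract; the precise form of the sparsity condition is given in the paper itself and is not reproduced here.

```latex
% Generative model: n independent sources mixed by a smooth, injective f
% into m >= n observations (m > n is the undercomplete case).
\[
  \mathbf{x} = \mathbf{f}(\mathbf{s}), \qquad
  p(\mathbf{s}) = \prod_{i=1}^{n} p_i(s_i), \qquad
  \mathbf{s} \in \mathbb{R}^{n},\ \mathbf{x} \in \mathbb{R}^{m}.
\]

% The assumptions are placed on the support (nonzero pattern) of the Jacobian
% of the mixing function, not on auxiliary variables.
\[
  \mathbf{J}_{\mathbf{f}}(\mathbf{s}) =
  \left[ \frac{\partial f_j}{\partial s_i}(\mathbf{s}) \right]_{j,i}
  \in \mathbb{R}^{m \times n}.
\]

% Target notion of identifiability: an estimated model
% \hat{\mathbf{x}} = \hat{\mathbf{f}}(\hat{\mathbf{s}}) that matches the
% distribution of x recovers each source up to a permutation \pi of the
% indices and an invertible scalar map h_i.
\[
  \hat{s}_i = h_i\!\left(s_{\pi(i)}\right), \qquad i = 1, \dots, n.
\]
```

Under this reading, the recovered sources may be reordered and individually rescaled or warped, but no recovered component mixes two or more true sources.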
