Bayesian ICA with super-Gaussian Source Priors
Jyotishka Datta, Soham Ghosh, Nicholas G. Polson
TL;DR
This work develops a fully Bayesian framework for independent component analysis (ICA) with super-Gaussian sources by introducing horseshoe-type priors through a Polya-Gamma scale-mixture representation. The authors unify MAP estimation and full posterior inference through a conjugate Gibbs sampler (Gibbs-ICE). On the theory side, they prove a uniform local asymptotic normality (LAN) expansion around the true unmixing matrix, establish posterior contraction at the parametric $N^{-1/2}$ rate in the $d_{\pm}$ metric, and derive a Bernstein-von Mises limit for the unmixing matrix up to signed permutations. Additional theory covers envelope optimization, auxiliary-function EM, and connections to nonlinear ICA and flow-based models. Simulations show performance competitive with leading ICA methods across several source distributions and validate the accuracy of source recovery and reconstruction under various noise regimes. The work thus provides a principled Bayesian treatment of ICA with scalable computation and asymptotic guarantees, with implications for semiparametric extensions and nonlinear feature extraction.
Abstract
Independent Component Analysis (ICA) plays a central role in modern machine learning as a flexible framework for feature extraction. We introduce a horseshoe-type prior with a latent Polya-Gamma scale mixture representation, yielding scalable algorithms for both point estimation via expectation-maximization (EM) and full posterior inference via Markov chain Monte Carlo (MCMC). This hierarchical formulation unifies several previously disparate estimation strategies within a single Bayesian framework. We also establish the first theoretical guarantees for hierarchical Bayesian ICA, including posterior contraction and local asymptotic normality results for the unmixing matrix. Comprehensive simulation studies demonstrate that our methods perform competitively with widely used ICA tools. We further discuss implementation of conditional posteriors, envelope-based optimization, and possible extensions to flow-based architectures for nonlinear feature extraction and deep learning. Finally, we outline several promising directions for future work.
