On the Identifiability of Sparse ICA without Assuming Non-Gaussianity

Ignavier Ng, Yujia Zheng, Xinshuai Dong, Kun Zhang

TL;DR

This work addresses the identifiability of sparse ICA when sources may be Gaussian. It introduces an identifiability framework, built on the connective structure from sources to observed variables, that relies only on second-order statistics. It defines a novel structural variability assumption on the support of the mixing matrix which, together with a condition that the mixing matrix can be permuted to lower-triangular form and a faithfulness condition, guarantees recovery of the true mixing matrix $\tilde{\mathbf{A}}$ up to signed column permutation. Recovery is posed as a sparsity-constrained covariance-matching problem: $\min_{\mathbf{A}} \|\mathbf{A}\|_0$ subject to $\mathbf{A}\mathbf{A}^\top=\tilde{\mathbf{A}}\tilde{\mathbf{A}}^\top$ and structural constraints. The paper provides two estimators, one decomposition-based and one likelihood-based. Both use a regularized objective with a constraint $g(\mathbf{A})=0$ that encodes the permutation-to-lower-triangular form, which allows practical optimization with the minimax concave penalty (MCP) for sparsity and L-BFGS solvers. The paper also connects the framework to causal discovery from second-order statistics, relating ICA under second-order constraints to linear SEM causal graphs with singleton Markov equivalence classes. Empirical results on synthetic data validate the theory, demonstrating identifiability and improved accuracy over baselines, particularly when the connective-structure assumptions hold, and shed light on the role of the proportion of Gaussian sources.
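The reason a sparsity objective is needed on top of covariance matching can be illustrated with a small numerical sketch. The matrices `A_true` and `Q` below are hypothetical values chosen for illustration and do not come from the paper; the sketch only shows that a rotated mixing matrix yields the same covariance $\mathbf{A}\mathbf{A}^\top$ while having a larger $\ell_0$ norm. It does not check the paper's structural variability assumption.

```python
import math


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def l0_norm(X, tol=1e-9):
    """Number of entries whose magnitude exceeds tol."""
    return sum(1 for row in X for v in row if abs(v) > tol)


def max_abs_diff(X, Y):
    return max(abs(a - b) for rx, ry in zip(X, Y) for a, b in zip(rx, ry))


# Hypothetical sparse mixing matrix (illustration only, not from the paper).
A_true = [[1.0, 0.0, 0.0],
          [0.5, 1.0, 0.0],
          [0.0, 0.7, 1.0]]

# An orthogonal matrix: rotation by theta in the plane of the first two sources.
theta = math.pi / 5
c, s = math.cos(theta), math.sin(theta)
Q = [[c, -s, 0.0],
     [s, c, 0.0],
     [0.0, 0.0, 1.0]]

A_rot = matmul(A_true, Q)

# With unit-variance Gaussian sources, the observed covariance is A A^T,
# which is unchanged when A is replaced by A Q.
cov_true = matmul(A_true, transpose(A_true))
cov_rot = matmul(A_rot, transpose(A_rot))

print("max covariance difference:", max_abs_diff(cov_true, cov_rot))
print("nonzeros in A_true:", l0_norm(A_true))
print("nonzeros in A_rot: ", l0_norm(A_rot))
```

In this example the two matrices are indistinguishable from second-order statistics alone, yet the rotated one is denser, which is the gap the $\ell_0$ objective is meant to exploit.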

Abstract

Independent component analysis (ICA) is a fundamental statistical tool used to reveal hidden generative processes from observed data. However, traditional ICA approaches struggle with the rotational invariance inherent in Gaussian distributions, often necessitating the assumption of non-Gaussianity in the underlying sources. This may limit their applicability in broader contexts. To accommodate Gaussian sources, we develop an identifiability theory that relies on second-order statistics without imposing further preconditions on the distribution of sources, by introducing novel assumptions on the connective structure from sources to observed variables. Different from recent work that focuses on potentially restrictive connective structures, our proposed assumption of structural variability is both considerably less restrictive and provably necessary. Furthermore, we propose two estimation methods based on second-order statistics and sparsity constraint. Experimental results are provided to validate our identifiability theory and estimation methods.

Paper Structure

This paper contains 43 sections, 45 theorems, 104 equations, 5 figures, and 2 algorithms.

Key Result

Proposition 1

If the true mixing matrix $\tilde{\mathbf{A}}$ does not satisfy the structural variability assumption, then there exists a solution $\hat{\mathbf{A}}$ to the sparsity-constrained optimization problem such that $\hat{\mathbf{A}}\not\sim\tilde{\mathbf{A}}$.
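The kind of non-identifiability this proposition describes can be sketched on a toy case. The $2\times 2$ matrices below are hypothetical and not taken from the paper, and the sketch assumes that $\sim$ denotes equality up to signed column permutation, as in the TL;DR. Whether `A_true` violates the paper's assumption is not verified here; the code only shows two equally sparse factorizations of the same covariance that are not related by a signed column permutation.

```python
import math
from itertools import permutations, product


def gram(X):
    """Return X X^T for a matrix given as a list of rows."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in X] for ri in X]


def l0_norm(X, tol=1e-9):
    return sum(1 for row in X for v in row if abs(v) > tol)


def max_abs_diff(X, Y):
    return max(abs(a - b) for rx, ry in zip(X, Y) for a, b in zip(rx, ry))


def same_up_to_signed_column_permutation(X, Y, tol=1e-9):
    """Brute-force check: is X equal to Y after permuting and sign-flipping columns?"""
    n = len(X[0])
    for perm in permutations(range(n)):
        for signs in product((1.0, -1.0), repeat=n):
            if all(abs(X[i][j] - signs[j] * Y[i][perm[j]]) <= tol
                   for i in range(len(X)) for j in range(n)):
                return True
    return False


# Hypothetical true mixing matrix (illustration only).
A_true = [[1.0, 0.0],
          [1.0, 1.0]]

# A different factorization of the same covariance [[1, 1], [1, 2]].
r = 1.0 / math.sqrt(2.0)
A_alt = [[r, r],
         [0.0, math.sqrt(2.0)]]

# Both have three nonzeros. Three is also the minimum here: an invertible
# 2x2 matrix with only two nonzeros gives a diagonal covariance.
print("covariance gap:", max_abs_diff(gram(A_true), gram(A_alt)))
print("nonzeros:", l0_norm(A_true), l0_norm(A_alt))
print("equivalent:", same_up_to_signed_column_permutation(A_true, A_alt))
```

Here both matrices attain the same sparsity and the same covariance, so the sparsity-constrained problem alone cannot tell them apart.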

Figures (5)

  • Figure 1: Empirical results of MCC across different sample sizes. Error bars indicate the standard errors calculated based on $10$ random trials.
  • Figure 2: Empirical results of MCC across different ratios of Gaussian sources. Error bars indicate the standard errors calculated based on $10$ random trials.
  • Figure 3: Graphical representations of examples that satisfy the structural variability assumption.
  • Figure 4: Empirical results of Amari distance across different sample sizes. Error bars indicate the standard errors calculated based on $10$ random trials.
  • Figure 5: Empirical results of Amari distance across different ratios of Gaussian sources. Error bars indicate the standard errors calculated based on $10$ random trials.

Theorems & Definitions (84)

  • Definition 1: Connective Structure
  • Definition 2: Covariance Set
  • Example 1: Semialgebraic Constraints
  • Proposition 1
  • Remark 1: Necessary Condition
  • Proposition 2: Dimension of Covariance Set
  • Example 2
  • Proposition 3: Generic Property
  • Theorem 1: Identifiability with Sparsity
  • Theorem 2
  • ...and 74 more