Towards Identifiable Latent Additive Noise Models

Yuhang Liu, Zhen Zhang, Dong Gong, Erdun Gao, Biwei Huang, Mingming Gong, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi

TL;DR

This paper tackles identifiability in causal representation learning by exploiting changes in latent causal influences across environments via a surrogate variable $\mathbf{u}$. It develops a general framework based on latent additive noise models (with exponential-family noise) and proves complete identifiability up to permutation and scaling under a lambda constraint, plus a partial identifiability regime when only a subset of influences changes; it further extends these results to latent post-nonlinear models. A practical ELBO-based learning method is proposed to recover latent causal graphs, enforcing a causal order and sparsity to reveal structure, with theoretical guarantees guiding inference. Empirical validation spans synthetic data, semi-synthetic fMRI data, and real human motion datasets, demonstrating accurate latent graphs, meaningful interventions, and superior performance of MLP-based approaches over polynomial or linear baselines. Collectively, the work broadens identifiable CRL capabilities to more realistic nonlinear and heterogeneous settings, with clear implications for neuroscience and biomechanics datasets where task- or environment-driven shifts occur.

Abstract

Causal representation learning (CRL) offers the promise of uncovering the underlying causal model by which observed data was generated, but the practical applicability of existing methods remains limited by the strong assumptions required for identifiability and by challenges in applying them to real-world settings. Most current approaches are applicable only to relatively restrictive model classes, such as linear or polynomial models, which limits their flexibility and robustness in practice. One promising approach to this problem seeks to address these issues by leveraging changes in causal influences among latent variables. In this vein we propose a more general and relaxed framework than typically applied, formulated by imposing constraints on the function classes applied. Within this framework, we establish partial identifiability results under weaker conditions, including scenarios where only a subset of causal influences change. We then extend our analysis to a broader class of latent post-nonlinear models. Building on these theoretical insights, we develop a flexible method for learning latent causal representations. We demonstrate the effectiveness of our approach on synthetic and semi-synthetic datasets, and further showcase its applicability in a case study on human motion analysis, a complex real-world domain that also highlights the potential to broaden the practical reach of identifiable CRL models.
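To make the setting concrete, the following is a minimal sketch of data drawn from a latent additive noise model whose causal influences change across environments. It is an illustration under stated assumptions, not the paper's generative model: the three-node chain, the `tanh` parent effects, the environment-dependent strengths `lam1` and `lam2`, the Gaussian noise, and the mixing function `mix` are all hypothetical choices made here.

```python
import math
import random


def sample_latents(u, n_samples, rng):
    """Draw samples from a toy chain z1 -> z2 -> z3 with additive noise.

    Assumption (not from the paper): the environment value `u` scales the
    strength of each causal influence through lam1(u) and lam2(u).
    """
    lam1 = 0.5 + u        # strength of z1 -> z2 in environment u
    lam2 = 1.0 - 0.3 * u  # strength of z2 -> z3 in environment u
    samples = []
    for _ in range(n_samples):
        # Gaussian noise is one member of the exponential family.
        n1, n2, n3 = (rng.gauss(0.0, 1.0) for _ in range(3))
        z1 = n1
        z2 = lam1 * math.tanh(z1) + n2  # nonlinear parent effect + noise
        z3 = lam2 * math.tanh(z2) + n3
        samples.append((z1, z2, z3))
    return samples


def mix(z):
    """Hypothetical invertible mixing x = f(z): linear map, then leaky ReLU."""
    a = ((1.0, 0.5, 0.0), (0.0, 1.0, 0.5), (0.5, 0.0, 1.0))  # det = 1.125
    pre = [sum(w * v for w, v in zip(row, z)) for row in a]
    return tuple(p if p > 0 else 0.2 * p for p in pre)


def make_dataset(environments, n_samples, seed=0):
    """Return (u, x) pairs pooled over all environments."""
    rng = random.Random(seed)
    data = []
    for u in environments:
        for z in sample_latents(u, n_samples, rng):
            data.append((u, mix(z)))
    return data
```

Calling `make_dataset([0.0, 1.0, 2.0], 1000)` yields observations from three environments in which only the strengths of the latent influences differ, which is the kind of variation the identifiability results rely on.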

Paper Structure (45 sections, 11 theorems, 47 equations, 22 figures, 2 tables)

Key Result

Theorem 3.1

Suppose latent causal variables $\mathbf{z}$ and the observed variable $\mathbf{x}$ follow the causal data generative models defined in Eqs. eq:Generative:n - eq:Generative:x. Assume the following holds: Then each true latent variable $z_i$ is linearly related to exactly one estimated latent variable $\hat{z}_j$, as $z_i = s_j \hat{z}_j + c_i$, for some constants $s_j$ and $c_i$, where all $\hat{
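The relation $z_i = s_j \hat{z}_j + c_i$ means that a perfect estimate matches the truth up to a permutation of coordinates plus a per-coordinate scale and shift. A minimal sketch of how such a match could be checked numerically is given below. It assumes a score defined as the mean absolute Pearson correlation under the best one-to-one matching; the function names are hypothetical, and the paper's own metric may be defined differently.

```python
import itertools
import math


def pearson(a, b):
    """Pearson correlation between two equally long sample lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def mean_matched_correlation(z_true, z_hat):
    """Best mean |correlation| over all one-to-one matchings.

    z_true, z_hat: lists of columns, one list of samples per latent.
    Returns (score, perm), where z_true[i] is matched to z_hat[perm[i]].
    Absolute values make the score invariant to sign and scale; Pearson
    correlation is already invariant to shifts.
    """
    d = len(z_true)
    corr = [[abs(pearson(z_true[i], z_hat[j])) for j in range(d)]
            for i in range(d)]
    best_score, best_perm = -1.0, None
    # Brute force over permutations: fine for small d only.
    for perm in itertools.permutations(range(d)):
        score = sum(corr[i][perm[i]] for i in range(d)) / d
        if score > best_score:
            best_score, best_perm = score, perm
    return best_score, best_perm
```

A score close to 1 indicates that every true latent is (nearly) an affine function of exactly one estimated latent. For more than a handful of latents, the brute-force search could be replaced by an assignment solver such as `scipy.optimize.linear_sum_assignment`.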

Figures (22)

  • Figure 1: Performance comparison under latent additive Gaussian noise. Left: MPC scores for different methods, where the proposed MLPs method achieves the best performance, supporting our theoretical results. Right: SHD scores of the proposed method, Polynomials liu2023identifiable, and iVAE combined with the method from CDNOD_20.
  • Figure 2: Performance of the proposed method under scenarios where Condition \ref{itm:lambda} is not satisfied regarding the causal influence of $z_1\rightarrow z_2$ (consequently, $z_2\rightarrow z_3$, and $z_3\rightarrow z_4$). The results are in agreement with partial identifiability in Theorem \ref{theory: partial}, i.e., roughly speaking, latent variables that satisfy Condition \ref{itm:lambda} are identifiable, while those that do not are not identifiable.
  • Figure 3: (a) MPC scores achieved by different methods. Notably, the proposed MLPs achieve an average MPC of 0.981, outperforming polynomials (0.977) and linear models (0.965). (b) Recovered latent causal structures using (b1) latent linear models, (b2) latent polynomials, and (b3) latent MLPs. Results for linear models and polynomials are sourced from liu2023identifiable. Blue edges align with known anatomical connectivity, red edges violate anatomical constraints, and green edges have reversed directions.
  • Figure 4: Examples of the data we used.
  • Figure 5: Intervention results on $z_1$ and $z_2$, which demonstrate the causal relationship from elbow to wrist in the right hand.
  • ...and 17 more figures
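Several captions above report SHD scores for recovered latent graphs. As a rough illustration, the structural Hamming distance between a true and an estimated directed graph can be computed as sketched below. The convention assumed here, which may differ from the one used in the paper, counts each node pair whose edge status differs (missing, extra, or reversed edge) as one error; the function name `shd` and the adjacency layout are illustrative choices.

```python
def shd(a_true, a_est):
    """Structural Hamming distance between two directed graphs.

    a_true, a_est: square 0/1 adjacency matrices (lists of lists) where
    entry [i][j] == 1 means an edge i -> j.
    Assumed convention: a missing, extra, or reversed edge between a pair
    of nodes each counts as a single error.
    """
    d = len(a_true)
    dist = 0
    for i in range(d):
        for j in range(i + 1, d):
            true_pair = (a_true[i][j], a_true[j][i])
            est_pair = (a_est[i][j], a_est[j][i])
            if true_pair != est_pair:
                dist += 1
    return dist


# True chain z1 -> z2 -> z3 -> z4 (indices 0..3).
chain = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
# Estimate with one reversed edge (2 -> 1) and one extra edge (0 -> 3).
estimate = [
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]
print(shd(chain, estimate))  # prints 2
```

Lower values are better, and a score of 0 means the estimated graph matches the true one edge for edge.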

Theorems & Definitions (26)

  • Theorem 3.1
  • Remark 3.2: Types of Changes in Causal Influences That Facilitate Identifiability
  • Example 3.3
  • Remark 3.4: A Bridge Between Nonlinear ICA and CRL
  • Example 3.5: Implicit Alignment of Environments
  • Theorem 3.6
  • Remark 3.7: Parent nodes do not impact children
  • Remark 3.8: Subspace identifiability
  • Corollary 3.9
  • Corollary 3.10
  • ...and 16 more