Identifiable Latent Polynomial Causal Models Through the Lens of Change

Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi

TL;DR

This work addresses identifiability in causal representation learning by introducing varying polynomial causal models with exponential-family noise, extending beyond prior linear-Gaussian assumptions. It proves that latent variables and their causal graph are identifiable under $2\ell+1$ environments, with complete identifiability up to permutation and scaling when all polynomial coefficients change, and offers partial identifiability results when only a subset changes. A practical variational framework is proposed to learn these polynomial causal representations, incorporating a DAG-structured prior and two KL terms to promote correct latent structure. Experiments on synthetic data, chemistry-inspired images, and resting-state fMRI validate identifiability and learning efficacy. The approach broadens identifiability theory to nonlinear, non-Gaussian settings and demonstrates that changes in latent causal influences can yield consistent, interpretable latent representations applicable to domain adaptation and generalization. Overall, the paper advances both theory and practice in uncovering high-level latent causality from complex, environment-shifting data.

Abstract

Causal representation learning aims to unveil latent high-level causal representations from observed low-level data. One of its primary tasks is to provide reliable assurance of identifying these latent causal models, known as identifiability. A recent breakthrough explores identifiability by leveraging the change of causal influences among latent causal variables across multiple environments \citep{liu2022identifying}. However, this progress rests on the assumption that the causal relationships among latent causal variables adhere strictly to linear Gaussian models. In this paper, we extend the scope of latent causal models to involve nonlinear causal relationships, represented by polynomial models, and general noise distributions conforming to the exponential family. Additionally, we investigate the necessity of imposing changes on all causal parameters and present partial identifiability results when part of them remains unchanged. Further, we propose a novel empirical estimation method, grounded in our theoretical finding, that enables learning consistent latent causal representations. Our experimental results, obtained from both synthetic and real-world data, validate our theoretical contributions concerning identifiability and consistency.
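
The setting described above, latent variables linked by polynomial causal relationships whose coefficients change across environments, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' data-generation code: the chain graph, the degree-2 polynomial, the Gaussian noise (one member of the exponential family), the `tanh` mixing, and the names `sample_latents` and `simulate` are all hypothetical choices made for illustration. It also assumes that $\ell$ in the $2\ell+1$ environment count denotes the number of latent variables.

```python
import numpy as np


def sample_latents(rng, coeffs, n_samples, noise_std=1.0):
    """Sample a chain z_1 -> z_2 -> ... with degree-2 polynomial links.

    coeffs has shape (n_latents - 1, 2); row i holds (a, b) such that
    z_{i+1} = a * z_i + b * z_i**2 + Gaussian noise.
    """
    n_latents = coeffs.shape[0] + 1
    z = np.zeros((n_samples, n_latents))
    z[:, 0] = rng.normal(0.0, noise_std, size=n_samples)
    for i, (a, b) in enumerate(coeffs):
        parent = z[:, i]
        noise = rng.normal(0.0, noise_std, size=n_samples)
        z[:, i + 1] = a * parent + b * parent**2 + noise
    return z


def simulate(n_latents=3, n_obs=8, n_samples=500, seed=0):
    """Generate one dataset per environment u, with all coefficients changing."""
    rng = np.random.default_rng(seed)
    # Assumption: ell is the number of latent variables, giving 2*ell + 1 environments.
    n_envs = 2 * n_latents + 1
    # Hypothetical mixing shared by all environments: only the latent
    # causal coefficients change with u, not the map from z to x.
    mixing = rng.normal(size=(n_latents, n_obs))
    environments = []
    for u in range(n_envs):
        # Fresh polynomial coefficients for every environment.
        coeffs = rng.uniform(0.5, 1.5, size=(n_latents - 1, 2))
        z = sample_latents(rng, coeffs, n_samples)
        x = np.tanh(z @ mixing) + 0.01 * rng.normal(size=(n_samples, n_obs))
        environments.append({"u": u, "coeffs": coeffs, "z": z, "x": x})
    return environments
```

Calling `simulate()` returns seven environments for three latent variables. Holding one row of `coeffs` fixed across environments would mimic the partial-change setting in which only some causal parameters vary.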

Paper Structure

This paper contains 39 sections, 7 theorems, 29 equations, 13 figures, and 2 tables.

Key Result

Theorem 3.1

Suppose latent causal variables $\mathbf{z}$ and the observed variable $\mathbf{x}$ follow the causal generative models defined in Eq. \eqref{eq:Generative1} - Eq. \eqref{eq:poly_generative_add}. Assume the following holds: then the true latent causal variables $\mathbf{z}$ are related to the estimated latent causal variables $\hat{\mathbf{z}}$, which are learned by matching the true marginal data distributio

Figures (13)

  • Figure 1: Assume that ground truth is depicted in Figure 1 (a). Due to the transitivity, the graph structure in Figure 1 (b) is an alternative solution for (a), leading to the non-identifiability result. Figure 1 (c) depicts a special structure where two 'pure' child nodes appear. Figure 1 (d) demonstrates the change of the causal influences, characterized by the introduced surrogate variable $\mathbf{u}$.
  • Figure 2: Performance of different methods on linear models with Beta noise, linear models with Gamma noise, and polynomial models with Gaussian noise. In terms of MPC, the proposed method performs better than the others, which verifies the proposed identifiability results. The right subfigure shows the SHD obtained by the proposed method under different model assumptions.
  • Figure 3: Performance of the proposed method when only part of the weights change, on linear models with Beta noise. The ground-truth causal graph is $z_1\rightarrow z_2\rightarrow z_3\rightarrow z_4$. From left to right: keeping the weight on $z_1\rightarrow z_2$, $z_2\rightarrow z_3$, and $z_3\rightarrow z_4$ unchanged. These results are consistent with the analysis of partial identifiability in Corollary \ref{corollary:partial}.
  • Figure 4: Samples from the image dataset generated by modifying the chemistry dataset in \citep{ke2021systematic}. The colors (states) of the objects change according to the causal graph: the 'diamond' causes the 'triangle', and the 'triangle' causes the 'square', i.e., $z_1\rightarrow z_2\rightarrow z_3$.
  • Figure 5: MPC obtained by different methods on the image dataset. The proposed method performs better than the others, consistent with our identifiability results.
  • ...and 8 more figures
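
The figure captions above report MPC without defining it in this excerpt. Assuming it denotes the mean Pearson correlation between true and estimated latent variables after the best one-to-one matching, as is common in the identifiability literature, it could be computed as in the sketch below. The function name `mean_pearson_correlation` and the brute-force search over permutations are illustrative assumptions, practical only for a handful of latent variables.

```python
from itertools import permutations

import numpy as np


def mean_pearson_correlation(z_true, z_est):
    """Mean absolute Pearson correlation under the best latent matching.

    z_true and z_est are arrays of shape (n_samples, n_latents).
    Absolute values make the score invariant to sign and scale, and the
    search over permutations makes it invariant to latent ordering.
    """
    n = z_true.shape[1]
    # np.corrcoef stacks the rows of both inputs; the upper-right block
    # holds the correlations between true and estimated latents.
    cross = np.abs(np.corrcoef(z_true.T, z_est.T)[:n, n:])
    return max(
        sum(cross[i, perm[i]] for i in range(n)) / n
        for perm in permutations(range(n))
    )
```

A score near 1 indicates that each estimated latent variable matches one true latent variable up to permutation and scaling, which is the form of recovery the identifiability result describes.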

Theorems & Definitions (7)

  • Theorem 3.1
  • Corollary 3.2
  • Corollary 3.3
  • Theorem A.1
  • Lemma A.2
  • Lemma A.3
  • Lemma A.4