Nonlinearity, Feedback and Uniform Consistency in Causal Structural Learning

Shuyan Wang

TL;DR

This work advances causal discovery by generalizing the k-Triangle Faithfulness assumption to smooth and non-Gaussian settings, establishing uniform consistency for both structure and causal-effect estimation under relaxed assumptions, and extending latent-variable learning to handle cycles and nonlinearity. It introduces the VCSGS algorithm and a nonparametric edge-estimation framework, grounded in total variation smoothness and NZ conditions, to achieve uniformly consistent structure inference. The latent-structure portion combines GIN, rank constraints, and tensor constraints to identify cycles within latent clusters and across latent blocks, under both Gaussian and non-Gaussian noise regimes, with concrete algorithms and experiments. Collectively, the thesis broadens the applicability of causal discovery to systems with latent constructs and nonlinear interactions, offering principled probabilistic and algebraic tools for robust causal inference in richer data-generating processes.

Abstract

The goal of causal discovery is to find automated search methods for learning causal structures from observational data. In some cases all variables of the causal mechanism of interest are measured, and the task is to predict the effect one measured variable has on another. In other cases the variables of primary interest are not directly observable but are instead inferred from their manifestations in the data. These are referred to as latent variables. One well-known example is the psychological construct of intelligence, which cannot be directly measured, so researchers try to assess it through various indicators such as IQ tests. In this case, causal discovery algorithms can uncover underlying patterns and structures to reveal the causal connections among the latent variables and between the latent and observed variables. This thesis focuses on two questions in causal discovery: (1) providing an alternative definition of k-Triangle Faithfulness that (i) is weaker than strong faithfulness when applied to the Gaussian family of distributions, (ii) can be applied to non-Gaussian families of distributions, and (iii) under the assumption that the modified version of Strong Faithfulness holds, can be used to show the uniform consistency of a modified causal discovery algorithm; and (2) relaxing the sufficiency assumption to learn causal structures with latent variables. Given the importance of inferring cause-and-effect relationships for understanding and forecasting complex systems, relaxing these simplifying assumptions is expected to extend causal discovery methods to a wider range of causal mechanisms and statistical phenomena.
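
To make the latent-variable setting concrete, the following is a minimal sketch of the classical rank (tetrad) constraint, assuming a linear measurement model with independent noise. It is an illustration only, not the thesis's algorithm: the function names `implied_covariance` and `tetrad_differences`, the loadings, and the noise variances are all hypothetical. When four indicators share a single latent parent, all three tetrad differences of their population covariance vanish; when the indicators are split across two correlated latents, only one of them does.

```python
def implied_covariance(loadings, latent_cov, noise_var):
    """Population covariance of X = loadings * L + noise (linear model, independent noise).

    loadings:   n x m list of lists (hypothetical factor loadings)
    latent_cov: m x m covariance of the latent variables
    noise_var:  length-n list of noise variances
    """
    n, m = len(loadings), len(latent_cov)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            cov[i][j] = sum(
                loadings[i][a] * latent_cov[a][b] * loadings[j][b]
                for a in range(m)
                for b in range(m)
            )
        cov[i][i] += noise_var[i]  # noise only affects the diagonal
    return cov


def tetrad_differences(cov, i, j, k, l):
    """Return the three tetrad differences among the variables i, j, k, l."""
    return (
        cov[i][j] * cov[k][l] - cov[i][k] * cov[j][l],
        cov[i][k] * cov[j][l] - cov[i][l] * cov[j][k],
        cov[i][j] * cov[k][l] - cov[i][l] * cov[j][k],
    )


# Assumed example 1: one latent variable with four indicators.
one_factor = implied_covariance(
    loadings=[[0.9], [0.8], [0.7], [0.6]],
    latent_cov=[[1.0]],
    noise_var=[0.5, 0.5, 0.5, 0.5],
)

# Assumed example 2: two correlated latents, each with two indicators.
two_factor = implied_covariance(
    loadings=[[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]],
    latent_cov=[[1.0, 0.5], [0.5, 1.0]],
    noise_var=[0.5, 0.5, 0.5, 0.5],
)

print("one latent: ", tetrad_differences(one_factor, 0, 1, 2, 3))
print("two latents:", tetrad_differences(two_factor, 0, 1, 2, 3))
```

The sketch works with population covariances rather than samples, so the vanishing tetrads are exact up to floating-point error. With finite data, a statistical test of the constraint would be needed instead.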

Paper Structure (79 sections, 32 theorems, 36 equations, 33 figures, 8 algorithms)

Key Result

Lemma 2.2.1

Given an ancestral set $\mathbf{A}\subset\mathbf{V}$ that contains the parents of $Y$ but not $Y$ itself: if $X$ is a parent of $Y$, then $T^{|\mathbf{A}|}\, e(X\rightarrow Y)\leq\epsilon_{X,Y|\mathbf{A}\setminus\{X\}}\leq e(X\rightarrow Y)$.

Figures (33)

  • Figure 1: a 4-trek on $\{X_1, X_2, X_3, X_4\}$
  • Figure 2: Tree $T_p$, single cycle $C_p$, and bipartite graph $K_{2,p-2}$, each with $p$ nodes (Uhler et al., 2013)
  • Figure 3: Proportion of $\lambda$-strong-unfaithful distributions and $k$-triangle-unfaithful distributions for 3 values of $\lambda$ and $k$
  • Figure 4: The maximum value of $k$ for which 10-node DAGs with expected neighborhood size 5 to 9 fully satisfy the $k$-Triangle-Faithfulness Assumption.
  • Figure 5: Proportion of $k$-triangle-unfaithful distributions for different average neighborhood sizes, for graphs with 8 vertices and $k=0.1$; nb_size stands for average neighborhood size.
  • ...and 28 more figures

Theorems & Definitions (55)

  • Lemma 2.2.1
  • proof
  • Lemma 2.2.2
  • proof
  • Theorem 2.2.3
  • proof
  • proof
  • Lemma 2.2.4
  • proof
  • Theorem 2.3.1
  • ...and 45 more