Causal Discovery from Poisson Branching Structural Causal Model Using High-Order Cumulant with Path Analysis

Jie Qiao, Yu Xiang, Zhengming Chen, Ruichu Cai, Zhifeng Hao

Abstract

Count data naturally arise in many fields, such as finance, neuroscience, and epidemiology, and discovering causal structure among count data is a crucial task in various scientific and industrial scenarios. One of the most common characteristics of count data is an inherent branching structure, described by a binomial thinning operator and an independent Poisson distribution, which together capture both branching and noise. For instance, in a population count scenario, mortality and immigration both contribute to the count: survival follows a Bernoulli distribution, and immigration follows a Poisson distribution. However, causal discovery from such data is challenging due to the non-identifiability issue: a single causal pair is Markov equivalent, i.e., $X\rightarrow Y$ and $Y\rightarrow X$ are distribution equivalent. Fortunately, in this work, we find that the causal order from $X$ to its child $Y$ is identifiable if $X$ is a root vertex and has at least two directed paths to $Y$, or if the ancestor of $X$ with the most directed paths to $X$ has a directed path to $Y$ that does not pass through $X$. Specifically, we propose a Poisson Branching Structural Causal Model (PB-SCM) and perform a path analysis on PB-SCM using high-order cumulants. Theoretical results establish the connection between paths and cumulants and demonstrate that the path information can be obtained from the cumulants. With the path information, the causal order is identifiable under some graphical conditions. We propose a practical algorithm for learning causal structure under PB-SCM, and experiments verify the effectiveness of the proposed method.
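As a rough illustration of the branching structure described in the abstract, the following Python sketch simulates a single causal pair $X\rightarrow Y$, where $Y$ is a binomially thinned copy of $X$ plus independent Poisson noise. The function names (`thin`, `simulate_pair`) and all parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def thin(rng, counts, prob):
    """Binomial thinning: each unit in `counts` survives independently with probability `prob`."""
    return rng.binomial(counts, prob)

def simulate_pair(n, lam_x=4.0, a=0.7, lam_y=1.0, seed=0):
    """Draw n samples of (X, Y) with X ~ Poisson(lam_x) and Y = a∘X + noise.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    x = rng.poisson(lam_x, size=n)                      # root count, e.g. current population
    y = thin(rng, x, a) + rng.poisson(lam_y, size=n)    # survivors plus immigration
    return x, y

x, y = simulate_pair(200_000)
# A thinned Poisson variable is again Poisson, and so is a sum of independent
# Poisson variables, so Y should have mean and variance near a*lam_x + lam_y.
print("mean(Y) =", y.mean(), "var(Y) =", y.var())
```

In this sketch both $X$ and $Y$ are marginally Poisson, which gives some intuition for why a single pair is hard to orient from its distribution alone.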

Paper Structure (31 sections, 20 theorems, 96 equations, 12 figures, 2 tables, 1 algorithm)

Key Result

Theorem 1

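Given a Poisson random variable $\epsilon$ and $n$ distinct sequences of coefficients $A_1,\dots,A_n$, we have
$$\kappa(\underbrace{A_1\circ\epsilon,\dots,A_1\circ\epsilon}_{k_1\text{ times}},\dots,\underbrace{A_n\circ\epsilon,\dots,A_n\circ\epsilon}_{k_n\text{ times}}) = \kappa(A_1\circ\epsilon,\dots,A_n\circ\epsilon),$$
where each $A_i \circ \epsilon$ repeats $k_i\ge 1$ times in the original cumulant and appears only once in the reduced cumulant.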
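The reducibility statement can be checked empirically in a small case. The Python sketch below takes two single-coefficient sequences, $A_1=(a)$ and $A_2=(b)$, with $k_1=2$ and $k_2=1$, and compares the sample third-order joint cumulant $\kappa(a\circ\epsilon, a\circ\epsilon, b\circ\epsilon)$ with the sample second-order cumulant $\kappa(a\circ\epsilon, b\circ\epsilon)$. It assumes that the two thinnings are applied independently to the same draw of $\epsilon$; the helper names and parameter values are hypothetical.

```python
import numpy as np

def joint_cumulant_2(u, v):
    """Second-order joint cumulant (covariance), estimated from samples."""
    return np.mean((u - u.mean()) * (v - v.mean()))

def joint_cumulant_3(u, v, w):
    """Third-order joint cumulant, which equals the third central cross-moment."""
    return np.mean((u - u.mean()) * (v - v.mean()) * (w - w.mean()))

rng = np.random.default_rng(0)
n, lam, a, b = 500_000, 3.0, 0.6, 0.5    # illustrative values only

eps = rng.poisson(lam, size=n)           # shared Poisson noise
x = rng.binomial(eps, a)                 # a∘eps
z = rng.binomial(eps, b)                 # b∘eps, thinned independently of x given eps

reduced = joint_cumulant_2(x, z)         # kappa(a∘eps, b∘eps)
repeated = joint_cumulant_3(x, x, z)     # kappa(a∘eps, a∘eps, b∘eps)
print("reduced:", reduced, "repeated:", repeated, "a*b*lam:", a * b * lam)
```

Under these assumptions both estimates should land close to $ab\lambda$, up to sampling error. This is a sanity check on one instance, not a proof of the theorem.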

Figures (12)

  • Figure 1: Illustration of branching structure causal modeling.
  • Figure 1: Generating process of $A_1 \circ \epsilon$, $A_2 \circ \epsilon$, and $A_3 \circ \epsilon$. Each leaf node corresponds to an original random variable.
  • Figure 2: Triangular structure. For simplicity, we denote the directed paths $P_1 : X_1\xrightarrow{a} X_2$ and $P_2 :X_1\xrightarrow{b_1} X_3\xrightarrow{b_2} X_2$, with sequences of path coefficients $A_1 =(a)$ and $A_2 =(b_1 ,b_2)$.
  • Figure 2: Obtaining conditional independence according to the hierarchical structure of the tree.
  • Figure 3: Illustration of decomposing the cumulant of causal direction, $\mathcal{C}_{1,2}$ and $\mathcal{C}_{3,2}$, in the triangular structure (Fig. 2). For simplicity, we denote $\kappa(\epsilon_i,A_i\circ \epsilon_i,...,A_j\circ \epsilon_i)$ by $(1,A_i,...,A_j)$ and denote $\kappa(b_1\circ\epsilon_i,A_i\circ \epsilon_i,...,A_j\circ \epsilon_i)$ by $(b_1,A_i,...,A_j)$.
  • ...and 7 more figures

Theorems & Definitions (42)

  • Definition 1: Poisson Branching Structural Causal Model
  • Definition 2: k-th order joint cumulant tensor
  • Definition 3: 2D slice of joint cumulant tensor
  • Theorem 1: Reducibility
  • Remark 1
  • Definition 4: $k$-path cumulants summation for root vertex
  • Theorem 2
  • Theorem 3: Identifiability for root vertex
  • Theorem 4: Graphical Implication of Identifiability for Root Vertex
  • Definition 5: $k$-path cumulants summation
  • ...and 32 more