Causal Structure Learning in Directed, Possibly Cyclic, Graphical Models

Pardis Semnani; Elina Robeva

TL;DR

This work addresses causal discovery for directed graphs that may contain cycles, assuming that the distribution is Markov and faithful to the unknown graph $G^\star$ and that there are no latent variables. It introduces a two-step hybrid approach. First, a greedy search over partially ordered partitions, guided by a sparsity-based graphical score GS, identifies the Markov equivalence class (MEC) of $G^\star$. Second, one of two SCCR algorithms constructs a graph in that MEC whose strongly connected components are aligned with the partition. The key contributions are (i) a Richardson-style characterization of Markov equivalence for cyclic graphs, (ii) a concrete, lexicographically ordered score GS whose minimizers uniquely determine the MEC, and (iii) two practical and theoretically grounded SCCR methods (construct-and-correct and submodular-flow-based) for realizing a graph in the MEC. Experiments on simulated data up to around $n=10$ demonstrate the approach's viability, suggesting a path toward reliable causal discovery in settings with cycles and without parametric assumptions. The framework lays the groundwork for relaxing the no-latent-variable assumption and for extending to more general structural equation models in future work.
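As a concrete reading of Markov equivalence (two graphs satisfying the same $d$-separation statements), small graphs can be compared by brute force. The sketch below is an illustration only and is not the paper's method, which searches over partially ordered partitions. It assumes the moralization form of the $d$-separation test (separation in the moral graph of the ancestral subgraph), which is standard for DAGs and is taken here to carry over to cyclic graphs; it also assumes there are no self-loops. All function names are hypothetical.

```python
from itertools import combinations


def ancestors(edges, targets):
    """Vertices with a directed path into `targets` (targets included)."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    seen, stack = set(targets), list(targets)
    while stack:
        v = stack.pop()
        for p in parents.get(v, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(edges, a, b, cond):
    """Assumed moralization test: restrict to ancestors of {a, b} and cond,
    marry parents of a common child, drop directions, then check whether
    cond blocks every undirected path from a to b."""
    keep = ancestors(edges, {a, b} | set(cond))
    nbrs = {v: set() for v in keep}
    parents = {v: set() for v in keep}
    for u, v in edges:
        if u in keep and v in keep:
            nbrs[u].add(v)
            nbrs[v].add(u)
            parents[v].add(u)
    for ps in parents.values():
        for p, q in combinations(ps, 2):  # moral edges between co-parents
            nbrs[p].add(q)
            nbrs[q].add(p)
    blocked = set(cond)
    seen, stack = {a}, [a]
    while stack:
        v = stack.pop()
        if v == b:
            return False
        for w in nbrs[v]:
            if w not in seen and w not in blocked:
                seen.add(w)
                stack.append(w)
    return True


def d_separations(vertices, edges):
    """All triples (a, b, C) with a < b such that a and b are d-separated given C."""
    out = set()
    for a, b in combinations(sorted(vertices), 2):
        rest = [v for v in sorted(vertices) if v not in (a, b)]
        for k in range(len(rest) + 1):
            for cond in combinations(rest, k):
                if d_separated(edges, a, b, cond):
                    out.add((a, b, frozenset(cond)))
    return out


def markov_equivalent(vertices, edges1, edges2):
    """Exponential-time check; only sensible for a handful of vertices."""
    return d_separations(vertices, edges1) == d_separations(vertices, edges2)


V = [1, 2, 3]
three_cycle = [(1, 2), (2, 3), (3, 1)]
complete_dag = [(1, 2), (2, 3), (1, 3)]
collider = [(1, 3), (2, 3)]
chain = [(1, 2), (2, 3)]
```

Here `markov_equivalent(V, three_cycle, complete_dag)` holds because every pair of vertices is adjacent in both graphs, so neither has any $d$-separation, while `collider` and `chain` differ: the collider separates 1 and 2 given the empty set, and the chain separates 1 and 3 given 2.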

Abstract

We consider the problem of learning a directed graph $G^\star$ from observational data. We assume that the distribution that gives rise to the samples is Markov and faithful to the graph $G^\star$ and that there are no unobserved variables. We do not rely on any further assumptions regarding the graph or the distribution of the variables. In particular, we allow for directed cycles in $G^\star$ and work in the fully non-parametric setting. Given the set of conditional independence statements satisfied by the distribution, we aim to find a directed graph that satisfies the same $d$-separation statements as $G^\star$. We propose a hybrid approach consisting of two steps. We first find a partially ordered partition of the vertices of $G^\star$ by optimizing a certain score in a greedy fashion. We prove that any optimal partition uniquely characterizes the Markov equivalence class of $G^\star$. Given an optimal partition, we propose an algorithm for constructing a graph in the Markov equivalence class of $G^\star$ whose strongly connected components correspond to the elements of the partition and are partially ordered according to the partial order of the partition. Our algorithm comes in two versions: one that is provably correct and another that runs fast in practice.
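To make the notion of a partially ordered partition concrete, the following minimal sketch computes the strongly connected components of a small directed graph together with the reachability order between them. The edge list is hypothetical: it was invented so that the components match those named in the caption of Figure 2, and it is not the paper's actual graph. The convention that a component $C$ precedes $D$ when $D$ is reachable from $C$ is likewise an assumption about how $\le_G$ is oriented.

```python
def reachable(adj, start):
    """Set of vertices reachable from `start` by directed paths (start included)."""
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def components_and_order(vertices, edges):
    """Return (components, order): the strongly connected components as
    frozensets, and the pairs (C, D) with C != D such that D is reachable from C."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
    reach = {v: reachable(adj, v) for v in vertices}
    # Two vertices share a component exactly when each reaches the other.
    comps = {
        frozenset(u for u in vertices if u in reach[v] and v in reach[u])
        for v in vertices
    }
    # Any representative works: reachability is constant within a component.
    order = {
        (c, d)
        for c in comps
        for d in comps
        if c != d and next(iter(d)) in reach[next(iter(c))]
    }
    return comps, order


# Hypothetical edges: three directed cycles plus two edges leaving {5, 6, 7}.
vertices = range(1, 10)
edges = [
    (1, 2), (2, 3), (3, 4), (4, 1),
    (5, 6), (6, 7), (7, 5),
    (8, 9), (9, 8),
    (5, 1), (7, 8),
]
comps, order = components_and_order(vertices, edges)
```

The quadratic reachability computation is chosen for readability; for larger graphs a linear-time method such as Tarjan's algorithm would be the usual choice.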

Paper Structure (18 sections, 13 theorems, 50 equations, 21 figures, 2 tables, 6 algorithms)

Introduction
Characterization of Markov Equivalence in Directed Cyclic Graphs
Discovering the Markov equivalence class
Defining a score
Greedy optimization of the score
Discovering a Markov equivalent directed graph
SCCR algorithm 1: Construct and correct
SCCR algorithm 2: Submodular flow polyhedron
Simulations
Discovering the Markov equivalence class
SCCR algorithm 1: Construct and correct
Discovering a Markov equivalent graph
Discussion
Proofs for the section on poset discovery
Pseudocode of SCCR algorithm 1: Construct and correct
...and 3 more sections

Key Result

Theorem 2.6

Assume that $G_1=(V,E_1)$ and $G_2=(V,E_2)$ are two directed graphs. Then $G_1$ and $G_2$ are Markov equivalent if and only if the following conditions hold:

Figures (21)

Figure 1: An example of a causal cycle.
Figure 2: An example of a cyclic graph with strongly connected components $\{1,2,3,4\}, \{5, 6, 7\}, \{8,9\}$ obeying the partial order: $\{5,6,7\}\le_G \{1,2,3,4\}; \{5,6,7\}\le_G \{8,9\}$.
Figure 3: The local characterization of Markov equivalence for DAGs does not hold in general for all directed graphs.
Figure 4: A schematic representation of the three conditions, any of which can make two vertices $a$ and $b$ $p$-adjacent in a directed graph.
Figure 5: A schematic representation of a triple $(a,b,c)$ as an unshielded conductor and an unshielded non-conductor in a directed graph.
...and 16 more figures

Theorems & Definitions (65)

Definition 1.1
Definition 1.2
Example 2.1
Definition 2.2
Definition 2.3
Definition 2.4
Definition 2.5
Theorem 2.6
Remark 2.7
Definition 3.1
...and 55 more
