Choosing DAG Models Using Markov and Minimal Edge Count in the Absence of Ground Truth

Joseph D. Ramsey; Bryan Andrews; Peter Spirtes

TL;DR

The paper addresses how to validate causal DAG/CPDAG models from data when no ground truth is available. It introduces the Markov Checker, a nonparametric, pointwise consistent test of the Markov condition (MC), and CAFS (Cross-Algorithm Frugality Search), a search across algorithms that favors models with the fewest edges among those passing the test. The core idea is to assess whether the p-values from the conditional-independence tests implied by a candidate graph are uniformly distributed, as expected under the null; models whose p-values depart from uniformity are rejected as not aligning with the population distribution, and frugal models are selected from those that remain. Key contributions include (i) formalizing MC testing via uniformity of p-values, (ii) relaxing the sparsest representation criterion to frugality and enabling comparisons across diverse algorithms, (iii) a practical software implementation within the Tetrad suite with large-scale capability, and (iv) simulation and empirical demonstrations showing that near-ground-truth models can be identified without ground-truth labels. The approach offers a scalable, ground-truth-free tool for domain experts to prune and tune causal structure learning pipelines, particularly for dense graphs where traditional reliability guarantees may fail.
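The uniformity check described above can be illustrated with a small sketch. It assumes the conditional-independence p-values for a candidate graph have already been computed, and it uses a Kolmogorov-Smirnov comparison against U(0, 1) with an asymptotic p-value approximation. The function names and the choice of the KS statistic are illustrative assumptions, not the Tetrad implementation.

```python
import math


def ks_uniform_statistic(p_values):
    """Kolmogorov-Smirnov distance between the empirical CDF and U(0, 1)."""
    xs = sorted(p_values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # The U(0, 1) CDF at x is x itself; compare it with the
        # empirical CDF just after x ((i + 1) / n) and just before x (i / n).
        d = max(d, (i + 1) / n - x, x - i / n)
    return d


def ks_asymptotic_p_value(d, n, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution.

    This is an approximation (assumed adequate for moderately large n),
    not an exact finite-sample p-value.
    """
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        # The alternating series converges slowly here, and the true
        # value is indistinguishable from 1 at this precision.
        return 1.0
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * series))


# Hypothetical inputs: evenly spread p-values versus p-values piled up near 0.
evenly_spread = [(i + 0.5) / 200 for i in range(200)]
piled_near_zero = [p ** 3 for p in evenly_spread]

for label, ps in [("evenly spread", evenly_spread), ("piled near zero", piled_near_zero)]:
    d = ks_uniform_statistic(ps)
    print(label, round(d, 4), ks_asymptotic_p_value(d, len(ps)))
```

A small uniformity p-value suggests that the candidate graph implies independences the data do not support, so the model would be rejected at the chosen significance level.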

Abstract

We give a novel nonparametric pointwise consistent statistical test (the Markov Checker) of the Markov condition for directed acyclic graph (DAG) or completed partially directed acyclic graph (CPDAG) models given a dataset. We also introduce the Cross-Algorithm Frugality Search (CAFS) for rejecting DAG models that either do not pass the Markov Checker test or that are not edge minimal. Edge minimality has been used previously by Raskutti and Uhler as a nonparametric simplicity criterion, though CAFS readily generalizes to other simplicity conditions. Reference to the ground truth is not necessary for CAFS, so it is useful for finding causal structure learning algorithms and tuning parameter settings that output causal models that are approximately true from a given data set. We provide a software tool for this analysis that is suitable for even quite large or dense models, provided a suitably fast pointwise consistent test of conditional independence is available. In addition, we show in simulation that the CAFS procedure can pick approximately correct models without knowing the ground truth.
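The selection step of CAFS, as summarized in the abstract, can be sketched as a filter followed by a minimum. The sketch below assumes each candidate model has already been scored with an edge count and a Markov Checker p-value. The `Candidate` class, the `cafs_select` function, the threshold `alpha = 0.05`, and the example values are all hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One algorithm/parameter setting and the model it produced (hypothetical)."""
    name: str
    num_edges: int
    markov_p_value: float  # uniformity p-value from a Markov check


def cafs_select(candidates, alpha=0.05):
    """Keep models that pass the Markov check, then return those with fewest edges."""
    passing = [c for c in candidates if c.markov_p_value >= alpha]
    if not passing:
        return []  # every candidate was rejected by the Markov check
    fewest = min(c.num_edges for c in passing)
    return [c for c in passing if c.num_edges == fewest]


# Made-up values for illustration only; they are not results from the paper.
candidates = [
    Candidate("algo_A, setting 1", num_edges=12, markov_p_value=0.001),
    Candidate("algo_A, setting 2", num_edges=18, markov_p_value=0.40),
    Candidate("algo_B, setting 1", num_edges=25, markov_p_value=0.55),
    Candidate("algo_C, setting 1", num_edges=18, markov_p_value=0.22),
]

for chosen in cafs_select(candidates):
    print(chosen.name, chosen.num_edges)
```

Note that the sparsest candidate here is rejected because it fails the Markov check, and ties in edge count are all returned rather than broken arbitrarily.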

Paper Structure (15 sections, 4 theorems, 4 figures, 1 table, 2 algorithms)

Introduction
Preliminaries
The Markov Condition
Repurposing Raskutti and Uhler's Recommendations
A Test of the Markov Condition
Data Overlap
The CAFS Procedure
Implementation
Expanding the Toolkit for Domain Experts
Limitations
A Simulation Comparison
An Empirical Example
Discussion
Conclusion
Acknowledgements

Key Result

Theorem 1

Let $G$ and $H$ be two DAGs such that the set of conditional independences implied by $G$ is a subset of the set of conditional independences implied by $H$, that is, $G$ is an I-map of $H$. If $G$ contains the v-structure $X \rightarrow Z \leftarrow Y$, then either $H$ contains the same v-structure or

Figures (4)

Figure 1: A comparison of p-value uniformity for conditional independence testing, for (i) the dataset randomly subsetted without replacement to sample size $n / 2$, (ii) the original dataset, without subsetting, and (iii) a newly sampled dataset.
Figure 2: Scatter plots of six statistics for the estimated CPDAGs against the Anderson-Darling p-value ($p_{ad}$) of a 20-bin histogram of the Markov Checker p-values for each algorithm variant from expected p-values from a 20-bin $U(0, 1)$ histogram.
Figure 3: For the US Crime data, scatter plots of four statistics for the estimated CPDAGs against the Kullback-Leibler divergence ($\mathit{kldiv}$) of a 20-bin histogram of the Markov Checker p-values for each algorithm variant from expected p-values from a 20-bin $U(0, 1)$ histogram.
Figure 4: The result of applying the FGES algorithm with $\lambda = 2$ to the US Crime data. The rationale for selecting this model from among the models produced by various algorithms is given in the text.
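The histogram-based divergence mentioned in the Figure 3 caption can be sketched as follows. The sketch bins p-values into 20 equal-width bins and computes the Kullback-Leibler divergence of the empirical bin frequencies from the uniform frequencies. The direction of the divergence (empirical relative to uniform), the handling of empty bins, and the function name are assumptions made here, not details confirmed by the captions.

```python
import math


def histogram_kl_from_uniform(p_values, bins=20):
    """KL divergence of binned p-value frequencies from a uniform histogram.

    Assumes p-values lie in [0, 1]. Empty bins contribute zero, following
    the usual convention 0 * log(0) = 0.
    """
    counts = [0] * bins
    for p in p_values:
        # A p-value of exactly 1.0 is placed in the last bin.
        index = min(int(p * bins), bins - 1)
        counts[index] += 1

    n = len(p_values)
    uniform_freq = 1.0 / bins
    kl = 0.0
    for count in counts:
        if count > 0:
            freq = count / n
            kl += freq * math.log(freq / uniform_freq)
    return kl


# Hypothetical inputs: perfectly even p-values versus p-values all in one bin.
even = [(i + 0.5) / 200 for i in range(200)]
concentrated = [0.01] * 200

print(histogram_kl_from_uniform(even))
print(histogram_kl_from_uniform(concentrated))
```

Under these assumptions the divergence is zero when every bin holds the same share of p-values and grows as the p-values concentrate, reaching $\log 20$ when they all fall into a single bin.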

Theorems & Definitions (11)

Definition 1: Global MC
Definition 2: CMC
Definition 3: Local MC
Definition 4: Ordered Local MC
Definition 5: CFC
Theorem 1: Chickering Lemma 28
Definition 6
Theorem 2: Raskutti and Uhler, Theorem 2.3
Corollary 1: Raskutti and Uhler Relaxation
Theorem 3: Uniformity Check
...and 1 more
