Better Simulations for Validating Causal Discovery with the DAG-Adaptation of the Onion Method

Bryan Andrews; Erich Kummerfeld

TL;DR

This paper addresses the lack of standard simulation benchmarks for validating causal discovery algorithms (CDAs) by introducing the DAG-adaptation of the Onion (DaO) method, which uniformly samples correlation matrices $R$ that are Markov to a given DAG $G$. DaO targets the distribution over $R$ rather than the distribution over linear effects, and it is paired with methods for generating DAGs with scale-free in-degree or out-degree ($SFi$-DAG and $SFo$-DAG). The result is a domain-free, parameter-free benchmark that avoids common artifacts such as varsortability and $R^2$-sortability biases. The authors prove that DaO samples uniformly from the space of correlation matrices that are Markov to the DAG, and they provide open-source Python and R implementations. In comparative simulations against the ZARX and Tetrad designs, DaO produces distinct, more uniform model distributions, which shows how previous simulation designs can spuriously favor certain causal discovery approaches. Overall, DaO offers a principled, universal standard for evaluating CDAs and lays the groundwork for domain-specific extensions and larger-scale benchmarking efforts.
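For orientation, the following is a minimal sketch of the unconstrained onion method (Lewandowski, Kurowicka, and Joe, 2009) that DaO adapts. It is not the authors' DaO implementation and it ignores the DAG entirely: it grows a correlation matrix one variable at a time, with no Markov constraints. The function name `onion_correlation` and its parameters are illustrative assumptions; `eta = 1` is the setting that corresponds to the uniform distribution in the published description of the method.

```python
import numpy as np

def onion_correlation(d, eta=1.0, rng=None):
    """Sample a d x d correlation matrix (d >= 2) with the onion method.

    Illustrative sketch only: no DAG constraints are imposed here.
    """
    rng = np.random.default_rng(rng)
    beta = eta + (d - 2) / 2
    # Start from a 2 x 2 correlation matrix.
    r12 = 2 * rng.beta(beta, beta) - 1
    R = np.array([[1.0, r12], [r12, 1.0]])
    for k in range(2, d):
        beta -= 0.5
        # Squared length of the new column in whitened coordinates (always < 1).
        y = rng.beta(k / 2, beta)
        # Uniform direction on the unit sphere in R^k.
        u = rng.standard_normal(k)
        u /= np.linalg.norm(u)
        w = np.sqrt(y) * u
        # Map back through a square root of R, so that q^T R^{-1} q = y < 1.
        A = np.linalg.cholesky(R)
        q = A @ w
        # Append the new row and column; the result stays positive definite.
        R = np.block([[R, q[:, None]], [q[None, :], np.ones((1, 1))]])
    return R

R = onion_correlation(5, rng=0)
```

Each step keeps the matrix positive definite because the new column `q` satisfies `q.T @ inv(R) @ q = y < 1`, which is the Schur complement condition for the enlarged matrix.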

Abstract

The number of artificial intelligence algorithms for learning causal models from data is growing rapidly. Most "causal discovery" or "causal structure learning" algorithms are primarily validated through simulation studies. However, no widely accepted simulation standards exist and publications often report conflicting performance statistics, even when only considering publications that simulate data from linear models. In response, several manuscripts have criticized a popular simulation design for validating algorithms in the linear case. We propose a new simulation design for generating linear models for directed acyclic graphs (DAGs): the DAG-adaptation of the Onion (DaO) method. DaO simulations are fundamentally different from existing simulations because they prioritize the distribution of correlation matrices rather than the distribution of linear effects. Specifically, the DaO method uniformly samples the space of all correlation matrices consistent with (i.e., Markov to) a DAG. We also discuss how to sample DAGs and present methods for generating DAGs with scale-free in-degree or out-degree. We compare the DaO method against two alternative simulation designs and provide implementations of the DaO method in Python and R: https://github.com/bja43/DaO_simulation. We advocate for others to adopt DaO simulations as a fair universal benchmark.
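To make "Markov to a DAG" concrete, here is a small sketch using the conventional direction that the abstract contrasts with DaO: fix linear effects first, then derive the correlation matrix. The chain DAG, coefficient values, and error variances are hypothetical and chosen only for illustration. For the chain $X_1 \to X_2 \to X_3$, the missing edge between $X_1$ and $X_3$ implies a vanishing partial correlation given $X_2$, even though the marginal correlation is nonzero.

```python
import numpy as np

# Hypothetical chain DAG: X1 -> X2 -> X3 (no edge between X1 and X3).
# B[i, j] is the linear effect of variable j on variable i (illustrative values).
B = np.array([[0.0,  0.0, 0.0],
              [0.8,  0.0, 0.0],
              [0.0, -0.5, 0.0]])
omega = np.diag([1.0, 0.5, 2.0])  # independent error variances (illustrative)

# Covariance implied by X = B X + e: Sigma = (I - B)^{-1} Omega (I - B)^{-T}.
inv = np.linalg.inv(np.eye(3) - B)
sigma = inv @ omega @ inv.T

# Rescale the covariance matrix to a correlation matrix.
s = np.sqrt(np.diag(sigma))
R = sigma / np.outer(s, s)

# Partial correlation of X1 and X3 given X2, read off the precision matrix of R.
P = np.linalg.inv(R)
partial_13 = -P[0, 2] / np.sqrt(P[0, 0] * P[2, 2])

print("marginal corr(X1, X3):    ", R[0, 2])
print("partial corr(X1, X3 | X2):", partial_13)
```

The marginal correlation is clearly nonzero while the partial correlation is zero up to floating-point error, which is the constraint a correlation matrix must satisfy to be Markov to this DAG.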

Paper Structure (27 sections, 5 theorems, 28 equations, 17 figures, 4 tables, 9 algorithms)

Introduction
Contributions
Previous Work
Why Simulate from DAGs by Sampling Correlation Matrices Uniformly?
Advantages of Directly Sampling the Correlation Matrix
Advantages of Sampling Uniformly
Background
Directed Acyclic Graphs
DAG Models
Recursive Structural Equation Models
The Onion Method
Multivariate Pearson Type II
Sampling Random DAGs
Methods
Scale-free Randomly Rewiring DAGs
...and 12 more sections

Key Result

Lemma 1

If $R_i$ is positive definite, then $R_{i+1}$ is positive definite if and only if:

Figures (17)

Figure 1: Vertex in/out-degree distributions for 100 DAGs with $|V| = 100$ and $\alpha = 10$.
Figure 2: Edge matrices for ER/SFi/SFo-DAGs with $|V| = 100$ and $\alpha = \frac{99}{2}$ (density $\frac{1}{2}$).
Figure 3: Uniformly sampled correlation matrices from DAGs with $|V| = 3$ and $|E| = 2$ corresponding to the $1 < 2 < 3$ column of Table \ref{tab:er_dags}, with 100 repetitions for each case. These are 2D projections of a 3D space, so many points are occluded by other points closer to the viewer.
Figure 4: Properties of 10 models generated from ER/SFi/SFo-DAGs with $|V| = 100$ and $\alpha = 10$.
Figure 5: Properties of 100 models generated from ER-DAGs with $|V| = 100$ and $\alpha = 10$.
...and 12 more figures

Theorems & Definitions (6)

Lemma 1
Lemma 2
Lemma 3
Lemma 4
Theorem 5
Remark 6
