ProDAG: Projected Variational Inference for Directed Acyclic Graphs

Ryan Thompson, Edwin V. Bonilla, Robert Kohn

TL;DR

ProDAG tackles DAG structure learning with uncertainty quantification through a Bayesian variational framework. Its prior and variational posterior are distributions over DAGs, obtained by projecting a continuous matrix $\tilde{W}$ onto the space of sparse acyclic weighted adjacency matrices, i.e. $W=\operatorname{pro}_\lambda(\tilde{W})$. The projection enforces exact acyclicity and sparsity while still permitting GPU-accelerated continuous optimization, with analytic gradients obtained via the implicit function theorem. The method extends to nonlinear structural equation models (SEMs). On linear and nonlinear synthetic data and on real data (e.g., Sachs), it is reported to improve on state-of-the-art baselines in both uncertainty quantification and structure recovery. The result is a scalable, uncertainty-aware DAG learning framework with open-source tooling that applies broadly to causal discovery tasks.
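
The projection can be written as a constrained nearest-point problem, following the description in the caption of Figure 1 ("the nearest acyclic matrix within an $\ell_1$-constrained region"). The display below is a sketch under assumptions: the Frobenius distance, the factor $\tfrac{1}{2}$, and the set name $\mathrm{DAG}$ are chosen here for illustration and are not fixed by this summary.

```latex
% Sketch only: the distance, the scaling, and the set name DAG are assumptions.
\operatorname{pro}_\lambda(\tilde{W})
  \in \operatorname*{arg\,min}_{W \in \mathrm{DAG},\; \|W\|_{\ell_1} \le \lambda}
  \tfrac{1}{2}\,\|\tilde{W} - W\|_F^2
```

Under this reading, $\lambda$ controls sparsity: a smaller $\ell_1$ budget forces more entries of $W$ to zero, while the constraint $W \in \mathrm{DAG}$ guarantees that every projected sample is acyclic.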

Abstract

Directed acyclic graph (DAG) learning is a central task in structure discovery and causal inference. Although the field has witnessed remarkable advances over the past few years, it remains statistically and computationally challenging to learn a single (point estimate) DAG from data, let alone provide uncertainty quantification. We address the difficult task of quantifying graph uncertainty by developing a Bayesian variational inference framework based on novel, provably valid distributions that have support directly on the space of sparse DAGs. These distributions, which we use to define our prior and variational posterior, are induced by a projection operation that maps an arbitrary continuous distribution onto the space of sparse weighted acyclic adjacency matrices. While this projection is combinatorial, it can be solved efficiently using recent continuous reformulations of acyclicity constraints. We empirically demonstrate that our method, ProDAG, can outperform state-of-the-art alternatives in both accuracy and uncertainty quantification.
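
The abstract does not say which continuous reformulation of acyclicity is used. As an illustration only, the sketch below implements one well-known reformulation, the polynomial trace characterization $h(W)=\operatorname{tr}\big[(I+\tfrac{1}{p}\,W\circ W)^p\big]-p$, which is zero exactly when the weighted graph of $W$ has no directed cycles. The function name `acyclicity_penalty` and the example matrices are hypothetical, and nothing here should be taken as ProDAG's actual implementation.

```python
import numpy as np


def acyclicity_penalty(W):
    """Polynomial trace acyclicity measure (an illustrative choice, not
    confirmed to be the reformulation ProDAG uses).

    Returns tr[(I + (W * W) / p)^p] - p, which is 0 when the graph with
    weighted adjacency matrix W is acyclic and positive otherwise.
    """
    W = np.asarray(W, dtype=float)
    p = W.shape[0]
    # The elementwise square makes every edge weight nonnegative, so
    # contributions from cycles cannot cancel each other out.
    M = np.eye(p) + (W * W) / p
    return float(np.trace(np.linalg.matrix_power(M, p)) - p)


# Hypothetical 3-node chain 0 -> 1 -> 2: acyclic.
dag = np.array([[0.0, 1.5, 0.0],
                [0.0, 0.0, -2.0],
                [0.0, 0.0, 0.0]])

# Adding the edge 2 -> 0 closes a directed cycle.
cyclic = dag.copy()
cyclic[2, 0] = 0.7

print(acyclicity_penalty(dag))     # zero for the acyclic matrix
print(acyclicity_penalty(cyclic))  # strictly positive for the cyclic one
```

Because such a function is differentiable in $W$, it can serve as a constraint or penalty inside a gradient-based solver, which is what makes an otherwise combinatorial projection amenable to continuous optimization.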

Paper Structure

This paper contains 41 sections, 3 theorems, 48 equations, 15 figures, 4 tables, and 3 algorithms.

Key Result

Theorem 1

Let $\tilde{W}$ be endowed with a continuous probability measure. Then the projection $\operatorname{pro}_\lambda(\tilde{W})$ is unique and measurable.

Figures (15)

  • Figure 1: Illustration of ProDAG's projected distributions. Samples (blue dots) from an unconstrained continuous space (blue ellipse) are projected onto the nearest acyclic matrix within an $\ell_1$-constrained region (orange diamonds). Projected samples (orange dots) satisfy acyclicity and sparsity constraints. Theoretically, we show that for any continuous $\tilde{W}\sim\mathbb{P}$ the projection $\operatorname{pro}_\lambda(\tilde{W})$ is unique and measurable. This result implies that the projected distribution is a valid distribution over DAGs.
  • Figure 2: Computation times in seconds with a sample size $n=100$. The averages (solid points) and standard errors (error bars) are measured over 10 independently and identically generated datasets.
  • Figure 3: Performance on synthetic datasets generated from linear Erdős–Rényi DAGs with $p=20$ nodes, $s=40$ edges, and Gaussian noise. The averages (solid points) and standard errors (error bars) are measured over 10 independently and identically generated datasets.
  • Figure 4: Performance on synthetic datasets generated from linear Erdős–Rényi DAGs with $p=100$ nodes, $s=200$ edges, and Gaussian noise. The averages (solid points) and standard errors (error bars) are measured over 10 independently and identically generated datasets.
  • Figure 5: Performance on synthetic datasets generated from nonlinear Erdős–Rényi DAGs with $p=10$ nodes, $s=20$ edges, and Gaussian noise. The averages (solid points) and standard errors (error bars) are measured over 10 independently and identically generated datasets.
  • ...and 10 more figures

Theorems & Definitions (6)

  • Theorem 1
  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • proof