
Smoothing the Landscape: Causal Structure Learning via Diffusion Denoising Objectives

Hao Zhu, Di Zhou, Donna Slonim

Abstract

Understanding causal dependencies in observational data is critical for informing decision-making. These relationships are often modeled as Bayesian Networks (BNs) and Directed Acyclic Graphs (DAGs). Existing methods, such as NOTEARS and DAG-GNN, often face issues with scalability and stability in high-dimensional data, especially when there is a feature-sample imbalance. Here, we show that the denoising score matching objective of diffusion models can smooth the gradients for faster, more stable convergence. We also propose an adaptive k-hop acyclicity constraint that improves runtime over existing solutions that require matrix inversion. We name this framework Denoising Diffusion Causal Discovery (DDCD). Unlike generative diffusion models, DDCD uses the reverse denoising process to infer a parameterized causal structure rather than to generate data. We demonstrate the competitive performance of DDCD on synthetic benchmarking data, and we show that our methods are practically useful through qualitative analyses on two real-world examples. Code is available at https://github.com/haozhu233/ddcd.
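The adaptive k-hop acyclicity constraint mentioned above is not spelled out on this page. A minimal sketch of one polynomial form consistent with the "k-hop" idea, and with avoiding the matrix inversion or matrix exponential used by earlier constraints, penalizes weighted cycles up to length K via traces of matrix powers; the function name `k_hop_acyclicity` and this exact formulation are illustrative assumptions, not necessarily the paper's definition.

```python
import numpy as np

def k_hop_acyclicity(W: np.ndarray, K: int) -> float:
    """Polynomial acyclicity penalty on a weighted adjacency matrix W.

    tr((W∘W)^k) sums the (nonnegative) weights of all length-k cycles,
    so the penalty is zero iff W contains no cycle of length <= K.
    Only matrix products are needed -- no inversion or exponential.
    """
    A = W * W          # elementwise square: nonnegative edge weights
    h = np.trace(A)    # length-1 cycles (self-loops)
    P = A.copy()
    for _ in range(K - 1):
        P = P @ A      # P now holds (W∘W)^k
        h += np.trace(P)
    return float(h)

# A DAG (edge 0 -> 1) incurs no penalty; a 2-cycle does.
W_dag = np.array([[0.0, 1.0], [0.0, 0.0]])
W_cyc = np.array([[0.0, 1.0], [1.0, 0.0]])
print(k_hop_acyclicity(W_dag, 3), k_hop_acyclicity(W_cyc, 3))
```

An "adaptive" variant might grow K during optimization, checking short cycles cheaply early on and longer ones as the estimate sparsifies.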

Paper Structure

This paper contains 34 sections, 2 theorems, 20 equations, 13 figures, 1 table.

Key Result

Theorem 1 (Objective Equivalence)

For linear SEMs, minimizing the denoising objective in Eq. eqn:denoising_obj is equivalent to minimizing the standard SEM reconstruction loss in Eq. eqn:linear_sem_obj. Therefore, under Assumptions 1–6 in Section subsec:ps, the denoising objective can be used for causal structure learning.
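A simplified calculation, under the assumption that the denoiser reconstructs the clean data from the noise-corrupted input through the same linear map $W$ (the paper's exact objective may instead predict the noise, which is equivalent up to reparameterization), shows why such an objective can smooth the optimization landscape. For $X \in \mathbb{R}^{n \times d}$ and noise $E$ with i.i.d. standard-normal entries,

$$\mathbb{E}_{E}\,\bigl\|(X + \sigma E)W - X\bigr\|_F^2 \;=\; \|XW - X\|_F^2 \;+\; \sigma^2 n\,\|W\|_F^2,$$

since the cross term vanishes in expectation and $\mathbb{E}\|EW\|_F^2 = n\|W\|_F^2$. The denoising objective therefore matches the SEM reconstruction loss up to a ridge-like term $\sigma^2 n \|W\|_F^2$, whose strongly convex curvature can account for the faster, more stable convergence reported in Figure 2.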

Figures (13)

  • Figure 1: Architectures of the models proposed in this paper. Note that the reverse path does not generate new samples; its only purpose is to learn W via the denoising objective.
  • Figure 2: The denoising objective accelerates convergence by smoothing gradients. (a) Average runtime vs. sample size (10 runs). (b) L-BFGS-B steps to convergence; NOTEARS-Denoising requires significantly fewer iterations. (c) Gradient norm trajectories during the first iteration on full data ($n=2,000$); the denoising objective yields a stable descent compared to the volatile gradients of the linear baseline. (First 5 steps omitted for clarity).
  • Figure 3: DDCD Linear and Nonlinear models demonstrate robust scalability and accuracy on synthetic benchmarks.
  • Figure 4: Inferred Causal Network around Lethal Outcome in Myocardial Infarction
  • Figure 5: Example weight estimates on graphs with 1,000 nodes ($n=2{,}000$ samples in both cases).
  • ...and 8 more figures

Theorems & Definitions (4)

  • Theorem 1: Objective Equivalence
  • Proof
  • Theorem 2
  • Proof