Dagma-DCE: Interpretable, Non-Parametric Differentiable Causal Discovery

Daniel Waxman; Kurt Butler; Petar M. Djuric

TL;DR

Dagma-DCE addresses the interpretability gap in differentiable causal discovery by redefining the adjacency between variables via the L2 derivative norm of the child functions, measured with respect to the input distribution. It presents a model-agnostic, differentiable optimization framework that enforces acyclicity through a central-path constraint and promotes sparsity via an L1 penalty on derivatives. The approach yields an interpretable, non-parametric measure of causal strength based on the differential causal effect (DCE), leading to adjacency matrices whose nonzero entries reflect true local causal influence and whose magnitudes correspond to interaction energy. Empirically, Dagma-DCE achieves competitive or state-of-the-art performance on synthetic benchmarks while enabling principled thresholding and expert-driven sparsity choices, with open-source code available for broad adoption.
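As a rough illustration of the adjacency definition described above, the sketch below estimates $W_{ij} = \sqrt{\mathbb{E}[(\partial_i f_j(x))^2]}$ by averaging squared finite-difference derivatives over input samples. It is not the authors' implementation: the helper `derivative_norm_adjacency`, the toy three-variable functions, and the use of independent Gaussian samples as a stand-in for observed data are all assumptions made for this example. A real model would typically use automatic differentiation rather than finite differences.

```python
import math
import random


def derivative_norm_adjacency(funcs, samples, h=1e-5):
    """Estimate W[i][j] = sqrt(mean over samples of (d f_j / d x_i)^2).

    funcs[j] maps a full input vector (list of floats) to the predicted
    value of variable j. Partial derivatives are approximated with
    central finite differences of step h.
    """
    d = len(funcs)
    W = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            total = 0.0
            for x in samples:
                x_plus = list(x)
                x_plus[i] += h
                x_minus = list(x)
                x_minus[i] -= h
                g = (funcs[j](x_plus) - funcs[j](x_minus)) / (2.0 * h)
                total += g * g
            W[i][j] = math.sqrt(total / len(samples))
    return W


# Hypothetical toy structure x0 -> x1 -> x2 (illustrative only).
funcs = [
    lambda x: 0.0,             # x0 has no parents
    lambda x: 2.0 * x[0],      # linear child: derivative is 2 everywhere
    lambda x: math.sin(x[1]),  # nonlinear child: derivative is cos(x1)
]

# Placeholder inputs; in practice these would be the observed data.
rng = random.Random(0)
samples = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(2000)]

W = derivative_norm_adjacency(funcs, samples)
```

In this toy case the entry for the linear edge recovers the coefficient magnitude, the entry for the nonlinear edge is the root-mean-square slope over the samples, and entries for absent edges are zero, which is the sense in which the magnitudes are directly interpretable.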

Abstract

We introduce Dagma-DCE, an interpretable and model-agnostic scheme for differentiable causal discovery. Current non- or over-parametric methods in differentiable causal discovery use opaque proxies of "independence" to justify the inclusion or exclusion of a causal relationship. We show theoretically and empirically that these proxies may be arbitrarily different from the actual causal strength. In contrast to existing differentiable causal discovery algorithms, Dagma-DCE uses an interpretable measure of causal strength to define weighted adjacency matrices. On a number of simulated datasets, we show that our method achieves state-of-the-art-level performance. We additionally show that Dagma-DCE allows for principled thresholding and sparsity penalties by domain experts. The code for our method is available open-source at https://github.com/DanWaxman/DAGMA-DCE and can easily be adapted to arbitrary differentiable models.
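To make the thresholding idea concrete, here is a minimal sketch, assuming a weighted adjacency matrix whose entries are root-mean-square partial derivatives (so a threshold reads as "ignore parents whose average effect on the child is below tau per unit change"). The function names, the example matrix, and the value of `tau` are hypothetical and chosen only for illustration; the DAG check uses Kahn's topological-sort algorithm, which is a standard technique and not something taken from the paper.

```python
from collections import deque


def threshold_adjacency(W, tau):
    """Keep edge i -> j only if its weight W[i][j] exceeds tau."""
    return [[w if w > tau else 0.0 for w in row] for row in W]


def is_dag(W):
    """Kahn's algorithm: True if the nonzero pattern of W has no directed cycle."""
    d = len(W)
    indegree = [sum(1 for i in range(d) if W[i][j] != 0.0) for j in range(d)]
    queue = deque(j for j in range(d) if indegree[j] == 0)
    visited = 0
    while queue:
        i = queue.popleft()
        visited += 1
        for j in range(d):
            if W[i][j] != 0.0:
                indegree[j] -= 1
                if indegree[j] == 0:
                    queue.append(j)
    # Every node is visited exactly when no cycle blocks the ordering.
    return visited == d


# Hypothetical learned weights: two strong edges plus small spurious ones.
W = [
    [0.00, 1.80, 0.02],
    [0.01, 0.00, 0.70],
    [0.00, 0.03, 0.00],
]

W_thresholded = threshold_adjacency(W, tau=0.1)
```

Here the raw matrix contains a weak back-edge that closes a cycle, and a threshold expressed in derivative units removes it while keeping the two strong edges.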

Paper Structure (21 sections, 2 theorems, 25 equations, 2 figures, 1 table)

INTRODUCTION
BACKGROUND
Structural Causal Models and Causal Discovery
Differentiable Causal Discovery
NON-INTERPRETABILITY OF Dagma
Theoretical Argument
Empirical Result with a Linear SCM
PROPOSED SOLUTION
Definition of the Optimization Problem
Dagma-DCE is Model Agnostic
Practical Considerations
INTERPRETABILITY OF Dagma-DCE
Measuring Causal Strength
Differential Causal Effect and Dagma-DCE
Empirical Results with a Linear SCM
...and 6 more sections

Key Result

Lemma 1

Let $\sigma(\cdot)$ denote the sigmoid activation function. Then for any $\delta, \epsilon> 0$, there exists an MLP $f_j$ with weight matrices $\mathbf{A}^{(1)}, \dots, \mathbf{A}^{(M)}$ such that $\lVert \mathbf{A}^{(1)}_{i\cdot} \rVert_{L^2} < \epsilon$ but $\lVert \partial_i f_j \rVert_{L^2}> \de
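The lemma statement above is cut off in this copy; the following numeric sketch assumes its conclusion is that the derivative norm exceeds $\delta$. It is an illustration rather than the paper's proof: it uses a hypothetical one-input, one-hidden-unit sigmoid MLP $f(x) = a_2\,\sigma(a_1 x)$, picks a first-layer weight smaller than $\epsilon$, and compensates with a large second-layer weight. The specific constants and the standard-normal input samples are assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_tiny_first_layer_mlp(eps, delta):
    """Build f(x) = a2 * sigmoid(a1 * x) with |a1| < eps but a steep slope.

    Since sigmoid'(0) = 1/4, the slope near zero is about a2 * a1 / 4,
    so choosing a2 = 8 * delta / a1 gives a slope of roughly 2 * delta.
    """
    a1 = eps / 2.0
    a2 = 8.0 * delta / a1

    def dfdx(x):
        s = sigmoid(a1 * x)
        return a2 * a1 * s * (1.0 - s)

    return a1, a2, dfdx


eps, delta = 1e-3, 10.0
a1, a2, dfdx = make_tiny_first_layer_mlp(eps, delta)

# Monte Carlo estimate of the L2 norm of the derivative under N(0, 1) inputs.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(5000)]
derivative_norm = math.sqrt(sum(dfdx(x) ** 2 for x in xs) / len(xs))
```

The first-layer weight stays below `eps` while the estimated derivative norm stays above `delta`, so a weight-based proxy would call the edge negligible even though the function responds strongly to its input.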

Figures (2)

Figure 1: The difference between the magnitude of the true derivatives in a linear causal model and the magnitude of the weighted graph in Dagma, for a random $10 \times 10$ Erdős-Rényi directed graph with $20$ expected edges. Gray boxes surrounding each cell denote the magnitude of the ground-truth linear coefficient.
Figure 2: Resulting SID (top left), SHD (top right), $F_1$ score (bottom left), and time elapsed (bottom right) for random data generated from (a) the ER-4 GP-additive model and (b) the ER-4 MLP model, as detailed in the synthetic data section. Boxes show the median and quartiles across $T=10$ trials for Dagma and Dagma-DCE, and $T=5$ trials for Notears, with whiskers showing the minimum and maximum values.

Theorems & Definitions (2)

Lemma 1
Lemma 2
