Analytic DAG Constraints for Differentiable DAG Learning
Zhen Zhang, Ignavier Ng, Dong Gong, Yuhang Liu, Mingming Gong, Biwei Huang, Kun Zhang, Anton van den Hengel, Javen Qinfeng Shi
TL;DR
This work rethinks differentiable DAG learning by casting DAG constraints as trace conditions of analytic matrix functions. By defining a flexible function class $\mathcal{F}$ and proving closure under differentiation, addition, and multiplication, it unifies prior constraints and enables the construction of higher-order, gradient-rich constraints with finite convergence radii. A path-following optimization framework with Hadamard-mapped adjacency matrices and $\ell_1$ regularization is proposed to mitigate gradient vanishing and numerical instability, with efficient gradient computation through caching and logarithmic-time updates. Across linear and nonlinear experiments, higher-order analytic DAG constraints consistently reduce gradient vanishing and improve structure recovery compared to state-of-the-art baselines, demonstrating robust DAG discovery under known and unknown data scales. The approach also provides theoretical insights into non-convexity via Hessian spectral radius, informing the trade-offs between gradient strength and optimization difficulty.
Abstract
Recovering the underlying Directed Acyclic Graph (DAG) structures from observational data presents a formidable challenge, partly due to the combinatorial nature of the DAG-constrained optimization problem. Recently, researchers have identified gradient vanishing as one of the primary obstacles in differentiable DAG learning and have proposed several DAG constraints to mitigate this issue. By developing the necessary theory to establish a connection between analytic functions and DAG constraints, we demonstrate that analytic functions from the set $\{f(x) = c_0 + \sum_{i=1}^{\infty} c_i x^i \mid \forall i > 0,\ c_i > 0;\ r = \lim_{i\rightarrow \infty} c_{i}/c_{i+1} > 0\}$ can be employed to formulate effective DAG constraints. Furthermore, we establish that this set of functions is closed under several functional operators, including differentiation, summation, and multiplication. Consequently, these operators can be leveraged to create novel DAG constraints based on existing ones. Using these properties, we design a series of DAG constraints and develop an efficient algorithm to evaluate them. Experiments in various settings demonstrate that our DAG constraints outperform previous state-of-the-art comparators. Our implementation is available at https://github.com/zzhang1987/AnalyticDAGLearning.
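To make the idea of a trace-based analytic constraint concrete, the following is a minimal sketch and not the authors' implementation. It assumes one particular member of the function set, $f(x) = (1 - x/s)^{-n}$, whose power-series coefficients are all positive and whose ratio $c_i/c_{i+1}$ tends to $s > 0$. The function name `analytic_dag_penalty` and the parameters `order` and `s` are hypothetical, chosen only for illustration. With $A = W \circ W$ (the Hadamard square of the weight matrix) and spectral radius of $A$ below $s$, the quantity $\mathrm{tr}(f(A)) - d$ is zero exactly when the graph of $W$ is acyclic.

```python
import numpy as np


def analytic_dag_penalty(W, order=1, s=1.0):
    """Illustrative penalty h(W) = tr((I - A/s)^(-order)) - d, with A = W * W.

    Hypothetical helper: f(x) = (1 - x/s)^(-order) has c_0 = 1 and c_i > 0,
    so tr(f(A)) = d + sum_i c_i * tr(A^i), and every tr(A^i) is non-negative
    because A is entrywise non-negative. The sum vanishes only for a DAG.
    """
    W = np.asarray(W, dtype=float)
    d = W.shape[0]
    A = W * W  # Hadamard square gives a non-negative adjacency matrix

    # The power series only converges when the spectral radius is below s.
    rho = np.max(np.abs(np.linalg.eigvals(A)))
    if rho >= s:
        raise ValueError("spectral radius of W*W must be smaller than s")

    M = np.linalg.inv(np.eye(d) - A / s)
    return np.trace(np.linalg.matrix_power(M, order)) - d


# A strictly upper-triangular W encodes a DAG; a 2-cycle does not.
W_dag = np.array([[0.0, 0.8, 0.3],
                  [0.0, 0.0, 0.5],
                  [0.0, 0.0, 0.0]])
W_cyc = np.array([[0.0, 0.5],
                  [0.5, 0.0]])

h_dag = analytic_dag_penalty(W_dag)
h_cyc_1 = analytic_dag_penalty(W_cyc, order=1)
h_cyc_2 = analytic_dag_penalty(W_cyc, order=2)
```

In this sketch the penalty is numerically zero for `W_dag` and positive for `W_cyc`. Raising `order` corresponds, up to scaling, to differentiating $f$, and for the same cyclic matrix it yields a larger penalty, which is one way to see how such operators can produce constraints with stronger signal on cycles.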
