ExDAG: an MIQP Algorithm for Learning DAGs
Pavel Rytir, Ales Wodecki, Jakub Marecek
TL;DR
This work learns causal DAG structures by formulating the problem as a mixed-integer quadratic program (MIQP) whose solution is a maximum likelihood estimator under the structural equation model $X = XW + Z$, where the graph induced by $W$ is acyclic. ExDAG solves the MIQP with a branch-and-bound-and-cut framework that adds cycle-exclusion constraints lazily, so it converges globally without enumerating all cycles beforehand; the dual bound additionally allows the solution quality to be assessed in real time. The method achieves superior structural Hamming distance and $F_1$ scores on medium-sized graphs ($d \le 25$) under Gaussian noise, and shows promising results on a NeurIPS real dataset, outperforming NOTEARS and other baselines. The approach offers scalable, provably global DAG learning that is applicable to causal inference and related graphical-model tasks, and it suggests future work on graph decompositions and pre-processing to broaden scalability further.
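For concreteness, one common way to write such a program is sketched below. This is an assumed textbook-style form, not necessarily the exact formulation used by ExDAG: the binary edge indicators $e_{ij}$, the weight bound $M$, and the sparsity penalty $\lambda$ are illustrative symbols that do not appear in the text above. Under Gaussian noise with equal variances (also an assumption here), minimizing the least-squares term coincides with maximizing the likelihood.

$$
\begin{aligned}
\min_{W,\,e}\quad & \lVert X - XW \rVert_F^2 + \lambda \sum_{i \ne j} e_{ij} \\
\text{s.t.}\quad & -M e_{ij} \le W_{ij} \le M e_{ij}, \quad e_{ij} \in \{0,1\}, \quad W_{ii} = 0, \\
& \sum_{(i,j) \in C} e_{ij} \le |C| - 1 \quad \text{for every directed cycle } C.
\end{aligned}
$$

The last family contains one constraint per directed cycle, which is why these constraints are added lazily rather than enumerated up front.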
Abstract
There has been growing interest in causal learning in recent years. Commonly used representations of causal structures, including Bayesian networks and structural equation models (SEMs), take the form of directed acyclic graphs (DAGs). We provide a novel mixed-integer quadratic programming formulation and an associated algorithm that identifies DAGs with a low structural Hamming distance to the ground truth, under identifiability assumptions. Exact learning is eventually guaranteed by the global convergence of the branch-and-bound-and-cut algorithm that we use. In addition, integer programming techniques give us access to the dual bound, which allows for a real-time assessment of the quality of the solution. Previously, integer programming techniques have been shown to scale poorly for DAG identification because of the super-exponential number of constraints that prevent the formation of cycles. The proposed algorithm circumvents this by generating only the violated constraints, using the so-called "lazy" constraints methodology. Our empirical results show that ExDAG outperforms state-of-the-art solvers in terms of structural Hamming distance and $F_1$ score when considering Gaussian noise on medium-sized graphs.
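The "lazy" constraint idea can be illustrated with a minimal separation routine. The sketch below is not the ExDAG implementation: the helper names `find_cycle` and `cycle_cut` are hypothetical, and it assumes the solver's candidate solution is available as a 0/1 adjacency matrix. It searches for one directed cycle and, if it finds one, returns the cycle-exclusion inequality that the candidate violates.

```python
from typing import List, Optional, Tuple

Edge = Tuple[int, int]


def find_cycle(adj: List[List[int]]) -> Optional[List[Edge]]:
    """Return the edges of one directed cycle in adj, or None if adj is acyclic."""
    n = len(adj)
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = [WHITE] * n
    parent = [-1] * n

    def dfs(u: int) -> Optional[List[Edge]]:
        color[u] = GREY
        for v in range(n):
            if not adj[u][v]:
                continue
            if color[v] == GREY:
                # Back edge u -> v closes the cycle v -> ... -> u -> v.
                cycle = [(u, v)]
                node = u
                while node != v:
                    cycle.append((parent[node], node))
                    node = parent[node]
                return cycle[::-1]
            if color[v] == WHITE:
                parent[v] = u
                found = dfs(v)
                if found:
                    return found
        color[u] = BLACK
        return None

    for start in range(n):
        if color[start] == WHITE:
            found = dfs(start)
            if found:
                return found
    return None


def cycle_cut(cycle: List[Edge]) -> Tuple[List[Edge], int]:
    """Encode the cut: the sum of edge indicators over `cycle` must be <= len(cycle) - 1."""
    return cycle, len(cycle) - 1


# Hypothetical candidate solution containing the cycle 0 -> 1 -> 2 -> 0.
candidate = [
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
]
cycle = find_cycle(candidate)
if cycle is not None:
    edges, rhs = cycle_cut(cycle)
    print(f"add lazy constraint: sum of e[i][j] over {edges} <= {rhs}")
```

In a branch-and-bound-and-cut solver, a routine of this kind would typically be called from a callback whenever a new integer-feasible candidate is found; candidates for which no cycle is found are accepted as DAGs.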
