Extremely Greedy Equivalence Search
Achille Nazaret, David Blei
TL;DR
The paper addresses causal structure discovery from finite data, a setting in which GES often fails to recover the Markov Equivalence Class (MEC), especially for dense graphs. It introduces eXtremely Greedy Equivalence Search (XGES) in two parts: (i) XGES-0, an interleaved search that prioritizes deletions while preserving the theoretical guarantees of GES, and (ii) XGES, which further refines the MEC by testing deletions of early-inserted edges to escape local maxima. The authors develop an efficient CPDAG-based implementation with score-update caching and validity propagation that scales to larger, denser graphs. Their experiments show that XGES outperforms GES and its variants in accuracy (SHD, F1) and in speed (up to 10–30x faster in some settings). Open-source Python and C++ implementations are provided, which makes the method practical for large-scale causal discovery on finite data.
Abstract
The goal of causal discovery is to learn a directed acyclic graph from data. One of the most well-known methods for this problem is Greedy Equivalence Search (GES). GES searches for the graph by incrementally and greedily adding or removing edges to maximize a model selection criterion. It has strong theoretical guarantees on infinite data but can fail in practice on finite data. In this paper, we first identify some of the causes of GES's failure, finding that it can get blocked in local optima, especially in denser graphs. We then propose eXtremely Greedy Equivalence Search (XGES), which involves a new heuristic to improve the search strategy of GES while retaining its theoretical guarantees. In particular, XGES favors deleting edges early in the search over inserting edges, which reduces the possibility of the search ending in local optima. A further contribution of this work is an efficient algorithmic formulation of XGES (and GES). We benchmark XGES on simulated datasets with known ground truth. We find that XGES consistently outperforms GES in recovering the correct graphs, and it is 10 times faster. XGES implementations in Python and C++ are available at https://github.com/ANazaret/XGES.
