Less Greedy Equivalence Search
Adiba Ejaz, Elias Bareinboim
TL;DR
This work introduces Less Greedy Equivalence Search (LGES), a family of score-based causal discovery algorithms designed to address the computational cost and finite-sample limitations of Greedy Equivalence Search (GES). Instead of always applying the highest-scoring forward insertion, LGES uses the ConservativeInsert and SafeInsert strategies, which yields speed-ups of up to \(10\)-fold and fewer structural errors. LGES can also use prior knowledge to guide the search, and it remains asymptotically correct when that knowledge is misspecified. The authors further extend the framework with I-Orient, a scalable, score-based procedure that uses interventional data to refine the observational Markov equivalence class (MEC) into a smaller, more informative interventional class (I-MEC). Experiments on synthetic and real-world datasets show that LGES is robust to misspecified knowledge and is more accurate and more scalable than GES, PC, and NOTEARS, with interventional data providing additional gains. Together, these methods offer a practical and theoretically grounded approach to causal discovery in high-dimensional settings and with heterogeneous data sources.
Abstract
Greedy Equivalence Search (GES) is a classic score-based algorithm for causal discovery from observational data. In the sample limit, it recovers the Markov equivalence class of graphs that describe the data. Still, it faces two challenges in practice: computational cost and finite-sample accuracy. In this paper, we develop Less Greedy Equivalence Search (LGES), a variant of GES that retains its theoretical guarantees while partially addressing these limitations. LGES modifies the greedy step; rather than always applying the highest-scoring insertion, it avoids edge insertions between variables for which the score implies some conditional independence. This more targeted search yields up to a \(10\)-fold speed-up and a substantial reduction in structural error relative to GES. Moreover, LGES can guide the search using prior knowledge, and can correct this knowledge when contradicted by data. Finally, LGES can use interventional data to refine the learned observational equivalence class. We prove that LGES recovers the true equivalence class in the sample limit, even with misspecified knowledge. Experiments demonstrate that LGES outperforms GES and other baselines in speed, accuracy, and robustness to misspecified knowledge. Our code is available at https://github.com/CausalAILab/lges.
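The modified greedy step described above can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it is a toy example under stated assumptions: a Gaussian BIC local score computed from a covariance matrix, a brute-force enumeration of every conditioning set (real GES only considers operators valid for the current graph), and a hypothetical filter named `conservative_candidates` that drops any pair for which some score difference is negative, that is, for which the score implies a conditional independence. The names `local_bic`, `insertion_delta`, `COV`, and `N` are likewise illustrative.

```python
from itertools import combinations

import numpy as np


def local_bic(cov, n, child, parents):
    """Gaussian BIC local score of `child` given `parents` (up to constants)."""
    parents = list(parents)
    var = cov[child, child]
    if parents:
        s_pp = cov[np.ix_(parents, parents)]
        s_pc = cov[parents, child]
        # Residual variance of the child after regressing on its parents.
        var = var - s_pc @ np.linalg.solve(s_pp, s_pc)
    return -0.5 * n * np.log(var) - 0.5 * np.log(n) * (len(parents) + 1)


def insertion_delta(cov, n, x, y, cond):
    """Score change from adding x to the parent set `cond` of y.

    A negative value means the score prefers leaving x out, which is
    read here as "x is independent of y given cond".
    """
    cond = list(cond)
    return local_bic(cov, n, y, cond + [x]) - local_bic(cov, n, y, cond)


def conservative_candidates(cov, n):
    """Toy filter (an assumption, not the paper's exact procedure):
    keep insertions only for pairs with no negative score difference."""
    d = cov.shape[0]
    kept = []
    for x, y in combinations(range(d), 2):
        others = [v for v in range(d) if v not in (x, y)]
        cond_sets = [c for k in range(len(others) + 1)
                     for c in combinations(others, k)]
        deltas = {c: insertion_delta(cov, n, x, y, c) for c in cond_sets}
        if min(deltas.values()) < 0:
            continue  # the score implies some independence: skip this pair
        kept.extend((delta, x, y, c) for c, delta in deltas.items())
    return sorted(kept, reverse=True)  # best-scoring insertion first


# Population covariance of the chain X -> Z -> Y with unit coefficients
# and unit noise variances; indices are 0 = X, 1 = Z, 2 = Y.
COV = np.array([[1.0, 1.0, 1.0],
                [1.0, 2.0, 2.0],
                [1.0, 2.0, 3.0]])
N = 1000  # assumed sample size, used only in the BIC penalty

for delta, x, y, cond in conservative_candidates(COV, N):
    print(f"insert {x} -> {y} given {cond}: delta = {delta:.2f}")
```

In this example X and Y are marginally dependent, so a purely greedy step would consider an edge between them, but the score difference given Z is negative, so the pair (0, 2) is filtered out and only the true adjacencies X, Z and Z, Y remain as candidates.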
