Hybrid Top-Down Global Causal Discovery with Local Search for Linear and Nonlinear Additive Noise Models
Sujai Hiremath, Jacqueline R. M. A. Maasch, Mengxiao Gao, Promit Ghosal, Kyra Gan
TL;DR
The paper tackles scalable causal discovery from observational data by combining top-down topological sorting, built from local causal structure, with nonparametric edge pruning. It introduces two hierarchical topological sort algorithms, LHTS for linear non-Gaussian additive noise models (ANMs) and NHTS for nonlinear ANMs, each of which exploits local ancestral or parental relationships to reduce the number of regressions and the size of conditioning sets. A nonparametric constraint-based edge discovery (ED) step then prunes spurious edges using targeted local conditioning sets. The algorithms come with correctness guarantees and worst-case polynomial time complexity. On synthetic data, the methods recover topological orderings and edges more accurately than state-of-the-art baselines, particularly in sparse graphs, with favorable runtimes in nonlinear settings. Together, these components make causal discovery more efficient and flexible across linear and nonlinear regimes.
Abstract
Learning the unique directed acyclic graph corresponding to an unknown causal model is a challenging task. Methods based on functional causal models can identify a unique graph, but either suffer from the curse of dimensionality or impose strong parametric assumptions. To address these challenges, we propose a novel hybrid approach for global causal discovery in observational data that leverages local causal substructures. We first present a topological sorting algorithm that leverages ancestral relationships in linear structural causal models to establish a compact top-down hierarchical ordering, encoding more causal information than linear orderings produced by existing methods. We demonstrate that this approach generalizes to nonlinear settings with arbitrary noise. We then introduce a nonparametric constraint-based algorithm that prunes spurious edges by searching for local conditioning sets, achieving greater accuracy than current methods. We provide theoretical guarantees for correctness and worst-case polynomial time complexities, with empirical validation on synthetic data.
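To make the contrast between hierarchical and linear orderings concrete, the following minimal Python sketch groups the nodes of a known toy DAG into top-down layers and counts the edges each kind of ordering leaves as candidates for pruning. It is an illustration only, not the paper's LHTS or NHTS procedure, which must infer the ordering from data. The function names `hierarchical_sort` and `candidate_edges` and the four-node graph are assumptions made for this example.

```python
def hierarchical_sort(nodes, edges):
    """Group the nodes of a DAG into top-down layers (layer 0 holds the roots)."""
    parents = {v: set() for v in nodes}
    for u, v in edges:
        parents[v].add(u)

    layers, placed, remaining = [], set(), set(nodes)
    while remaining:
        # A node is ready once all of its parents sit in earlier layers.
        layer = sorted(v for v in remaining if parents[v] <= placed)
        if not layer:
            raise ValueError("graph contains a cycle")
        layers.append(layer)
        placed.update(layer)
        remaining.difference_update(layer)
    return layers


def candidate_edges(layers):
    """Edges an ordering still permits: from an earlier layer to a later one."""
    return [
        (u, v)
        for i, earlier in enumerate(layers)
        for later in layers[i + 1:]
        for u in earlier
        for v in later
    ]


# Hypothetical toy graph: A -> C <- B, C -> D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "C"), ("B", "C"), ("C", "D")]

layers = hierarchical_sort(nodes, edges)
hierarchical_candidates = candidate_edges(layers)

# A linear ordering is the special case with one node per layer.
linear_candidates = candidate_edges([[v] for v in ["A", "B", "C", "D"]])

print(layers)
print(len(hierarchical_candidates), len(linear_candidates))
```

In this toy case the hierarchical ordering places A and B in the same layer, so the pair (A, B) is never a candidate edge, whereas the linear ordering A, B, C, D still has to test it. Fewer candidate edges means fewer conditional independence tests in a later pruning step.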
