Trust Your $\nabla$: Gradient-based Intervention Targeting for Causal Discovery
Mateusz Olko, Michał Zając, Aleksandra Nowak, Nino Scherrer, Yashas Annadani, Stefan Bauer, Łukasz Kuciński, Piotr Miłoś
TL;DR
Causal discovery from observational data is often underdetermined; interventions can improve identifiability but are costly. The authors propose Gradient-based Intervention Targeting (GIT), a plug-and-play algorithm that scores each candidate intervention by the expected magnitude of the gradient of the causal-structure loss, computed on imagined interventional data sampled from the current model. When paired with a gradient-based discovery method such as ENCO, GIT accelerates convergence, especially in low-data settings, and outperforms mutual-information–based baselines while closely matching the decisions of the oracle-like GIT-privileged variant. The work provides theoretical convergence arguments and extensive empirical results on synthetic and real graphs, highlighting GIT's data efficiency and robustness. Overall, GIT offers a principled, gradient-driven way to choose interventions actively, reducing the experimental burden of causal discovery.
Abstract
Inferring causal structure from data is a challenging task of fundamental importance in science. Observational data are often insufficient to identify a system's causal structure uniquely. While conducting interventions (i.e., experiments) can improve identifiability, interventional samples are usually challenging and expensive to obtain. Hence, experimental design approaches for causal discovery aim to minimize the number of interventions by estimating the most informative intervention target. In this work, we propose a novel Gradient-based Intervention Targeting method, abbreviated GIT, that 'trusts' the gradient estimator of a gradient-based causal discovery framework to provide signals for the intervention acquisition function. We provide extensive experiments on simulated and real-world datasets and demonstrate that GIT performs on par with competitive baselines, surpassing them in the low-data regime.
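The scoring idea summarized above (rank each candidate target by the gradient magnitude obtained on imagined interventional data) can be illustrated with a minimal sketch. This is not the authors' implementation and does not use ENCO's gradient estimator. It assumes a toy linear-Gaussian model with a fixed variable ordering (edges only from lower to higher index), a soft-masked squared-error loss with an analytic gradient, and a score equal to the mean gradient norm over imagined batches. All names (`sample_imagined_data`, `structure_grad`, `git_scores`, `gamma`, `W`) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sample_imagined_data(gamma, W, target, n, rng, noise=1.0, int_scale=2.0):
    """Sample a graph from the current edge beliefs, then draw data from it
    under a hard intervention on `target` (toy linear-Gaussian assumption)."""
    d = gamma.shape[0]
    # Assumption: edge i -> j is only allowed for i < j, which keeps graphs acyclic.
    probs = np.triu(sigmoid(gamma), k=1)
    A = (rng.random((d, d)) < probs).astype(float)
    A[:, target] = 0.0  # the intervention removes all incoming edges of the target
    X = np.zeros((n, d))
    for j in range(d):  # ancestral sampling in index order
        if j == target:
            X[:, j] = rng.normal(0.0, int_scale, size=n)
        else:
            X[:, j] = X @ (A[:, j] * W[:, j]) + rng.normal(0.0, noise, size=n)
    return X


def structure_grad(gamma, W, X, target):
    """Analytic gradient of the soft-masked squared-error loss
    L = 1/(2n) * sum_{j != target} ||x_j - sum_i p_ij W_ij x_i||^2
    with respect to the edge logits `gamma`, where p_ij = sigmoid(gamma_ij)."""
    n, d = X.shape
    S = sigmoid(gamma)
    mask = np.triu(np.ones((d, d)), k=1)
    R = X - X @ (S * mask * W)  # residuals, shape (n, d)
    R[:, target] = 0.0          # the intervened variable is excluded from the loss
    return -(X.T @ R) / n * W * S * (1.0 - S) * mask


def git_scores(gamma, W, n_batches=20, batch_size=64, seed=0):
    """Score every candidate target by the mean gradient norm over imagined batches."""
    rng = np.random.default_rng(seed)
    d = gamma.shape[0]
    scores = np.zeros(d)
    for k in range(d):
        norms = []
        for _ in range(n_batches):
            X = sample_imagined_data(gamma, W, k, batch_size, rng)
            norms.append(np.linalg.norm(structure_grad(gamma, W, X, k)))
        scores[k] = np.mean(norms)
    return scores


# Usage: four variables, maximally uncertain edge beliefs (all logits zero).
gamma = np.zeros((4, 4))
W = np.ones((4, 4))
scores = git_scores(gamma, W)
next_target = int(np.argmax(scores))  # intervene where the expected update is largest
```

In an active-learning loop, the selected target would then be intervened on in the real system, and the resulting data used to update the structure parameters before scoring again. The fixed ordering is only a device to keep the toy sampler acyclic; a real discovery method would also have to learn edge orientations.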
