Causal Discovery under Off-Target Interventions
Davin Choo, Kirankumar Shiragur, Caroline Uhler
TL;DR
The paper studies causal graph discovery under stochastic off-target interventions: each attempt at action $A_i$ actually intervenes on a random subset of vertices drawn from a distribution $\mathcal{D}_i$, and the goal is to minimize the number of interventions performed. For verification, it proves a formal equivalence between off-target verification and stochastic set cover. This yields a polynomial-time adaptive policy with expected cost $O(\overline{\nu}(G^*) \log n)$ and establishes NP-hardness of better-than-$\log n$ approximation. For search, it shows a hardness barrier and provides a polylogarithmic-approximation algorithm against the benchmark $\overline{\nu}^{\max}(G^*)$, with total cost $O(\overline{\nu}^{\max}(G^*) \log^4 n)$. This algorithm, OffTargetSearch, exploits 1/2-clique separators and recursive partitioning to orient edges within the Markov equivalence class (MEC), and it runs in polynomial time. Empirical results on synthetic and real graphs corroborate the theory, demonstrating competitive performance under various off-target distributions. The work lays a theoretical foundation for causal discovery with off-target interventions and outlines avenues for extending the guarantees to unknown distributions and finite-sample settings.
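To make the stochastic set cover view of verification concrete, here is a minimal Python sketch of an adaptive greedy policy. It is an illustration under stated assumptions, not the paper's algorithm: the names `expected_gain` and `adaptive_greedy_cover`, the element labels `e1`..`e3`, and the representation of each $\mathcal{D}_i$ as a list of `(probability, covered_elements)` pairs are all hypothetical, and no approximation guarantee is claimed for this code.

```python
import random

def expected_gain(dist, uncovered):
    """Expected number of still-uncovered elements that one attempt covers."""
    return sum(p * len(outcome & uncovered) for p, outcome in dist)

def adaptive_greedy_cover(universe, dists, rng, max_attempts=10_000):
    """Repeatedly attempt the action with the largest expected marginal gain.

    dists maps an action name to a list of (probability, frozenset) pairs
    (an assumed encoding of the off-target distribution for that action).
    """
    uncovered = set(universe)
    history = []
    while uncovered:
        if len(history) >= max_attempts:
            raise RuntimeError("attempt budget exhausted")
        best = max(dists, key=lambda a: expected_gain(dists[a], uncovered))
        if expected_gain(dists[best], uncovered) == 0:
            raise ValueError("some element cannot be covered by any action")
        probs = [p for p, _ in dists[best]]
        outcomes = [s for _, s in dists[best]]
        # Adaptivity: observe the realized outcome before choosing again.
        realized = rng.choices(outcomes, weights=probs, k=1)[0]
        uncovered -= realized
        history.append((best, realized))
    return history

# Hypothetical instance: A1 is noisy, A2 is deterministic.
dists = {
    "A1": [(0.5, frozenset({"e1", "e2"})), (0.5, frozenset({"e1"}))],
    "A2": [(1.0, frozenset({"e3"}))],
}
history = adaptive_greedy_cover({"e1", "e2", "e3"}, dists, random.Random(0))
print(len(history), "attempts:", [action for action, _ in history])
```

The policy is adaptive because each choice depends on which elements the earlier, randomly realized outcomes have already covered.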
Abstract
Causal graph discovery is a significant problem with applications across various disciplines. However, with observational data alone, the underlying causal graph can only be recovered up to its Markov equivalence class, and further assumptions or interventions are necessary to narrow down the true graph. This work addresses the causal discovery problem in the setting of stochastic interventions, with the natural goal of minimizing the number of interventions performed. We propose the following stochastic intervention model, which subsumes existing adaptive noiseless interventions in the literature while capturing scenarios such as fat-hand interventions and CRISPR gene knockouts: any intervention attempt results in an actual intervention on a random subset of vertices, drawn from a distribution that depends on the attempted action. Under this model, we study two fundamental problems in causal discovery, verification and search, and provide approximation algorithms with polylogarithmic competitive ratios. We also provide some preliminary experimental results.
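The intervention model in the abstract can be simulated in a few lines of Python. The sketch below is illustrative only: the names `attempt`, `cut_edges`, and `model`, the vertex labels, and the probabilities are all assumptions, not taken from the paper. It encodes each action's distribution as `(probability, subset)` pairs, so a noiseless intervention is the special case of a point mass. The `cut_edges` helper relies on the usual assumption for ideal interventions that an intervention on a set $S$ reveals the orientation of every edge with exactly one endpoint in $S$.

```python
import random

def attempt(action, model, rng):
    """Sample the vertex subset actually intervened on for an attempted action."""
    probs, outcomes = zip(*model[action])
    return rng.choices(outcomes, weights=probs, k=1)[0]

def cut_edges(edges, intervened):
    """Edges with exactly one endpoint in the intervened set."""
    return {(u, v) for u, v in edges if (u in intervened) != (v in intervened)}

# Hypothetical model on the path a - b - c.
model = {
    # Point mass: the attempt always hits exactly {b} (noiseless case).
    "noiseless": [(1.0, frozenset({"b"}))],
    # Fat-hand style: the attempt sometimes also hits the neighbor c.
    "fat_hand": [(0.7, frozenset({"b"})), (0.3, frozenset({"b", "c"}))],
}
edges = [("a", "b"), ("b", "c")]

rng = random.Random(1)
for action in model:
    realized = attempt(action, model, rng)
    print(action, sorted(realized), sorted(cut_edges(edges, realized)))
```

In this toy model, an off-target hit on both `b` and `c` cuts only the edge between `a` and `b`, which is why the realized subset, not the attempted action, determines what is learned.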
