Linear-Time Algorithms for Front-Door Adjustment in Causal Graphs
Marcel Wienöbst, Benito van der Zander, Maciej Liśkiewicz
TL;DR
Problem: Identify and estimate the total causal effect $P({\bf y} | do({\bf x}))$ in a DAG when unobserved confounding precludes covariate adjustment, by using front-door (FD) adjustment sets. Approach: develop algorithms that (i) find a front-door adjustment set ${\bf Z}$ in $O(n+m)$ time, (ii) enumerate all front-door sets with $O(n(n+m))$ delay, and (iii) compute an inclusion-minimal front-door set in $O(n+m)$ time, exploiting Bayes-Ball for $d$-separation and a linear-time forbidden-vertex propagation. Contributions: the first linear-time FD set finder, an $O(n(n+m))$-delay enumeration algorithm, and a linear-time minimal-FD-set finder, with implementations in multiple programming languages and large-scale empirical validation. Significance: makes front-door identification comparable in speed to back-door methods, enabling practical causal effect estimation in large DAGs; future work includes efficiently finding FD sets of true minimum size and handling causal discovery over Markov-equivalence classes.
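To make the role of $d$-separation concrete, the following minimal Python sketch checks whether a given set ${\bf Z}$ satisfies the front-door criterion. It is not the linear-time finder summarized above: it only verifies a candidate set. The function names (`reachable`, `d_separated`, `is_front_door`) and the edge-list graph encoding are illustrative assumptions, and the criterion is taken in its common $d$-separation form, namely that ${\bf Z}$ intercepts all directed paths from ${\bf X}$ to ${\bf Y}$, that ${\bf X}$ and ${\bf Z}$ are $d$-separated once edges leaving ${\bf X}$ are removed, and that ${\bf Z}$ and ${\bf Y}$ are $d$-separated given ${\bf X}$ once edges leaving ${\bf Z}$ are removed.

```python
from collections import deque


def _adjacency(edges):
    """Build parent and child maps from a list of directed edges (u, v)."""
    parents, children = {}, {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)
        parents.setdefault(v, set()).add(u)
    return parents, children


def reachable(edges, sources, given):
    """Nodes d-connected to `sources` given `given` (Bayes-Ball-style search)."""
    parents, children = _adjacency(edges)
    given = set(given)
    # Ancestors of the conditioning set (inclusive): colliders there are open.
    anc, stack = set(given), list(given)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in anc:
                anc.add(p)
                stack.append(p)
    # "up" = arrived from a child, "down" = arrived from a parent.
    queue = deque((s, "up") for s in sources)
    visited, found = set(), set()
    while queue:
        node, direction = queue.popleft()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node not in given:
            found.add(node)
        if direction == "up" and node not in given:
            queue.extend((p, "up") for p in parents.get(node, ()))
            queue.extend((c, "down") for c in children.get(node, ()))
        elif direction == "down":
            if node not in given:
                queue.extend((c, "down") for c in children.get(node, ()))
            if node in anc:
                queue.extend((p, "up") for p in parents.get(node, ()))
    return found


def d_separated(edges, a, b, given):
    """True if every node of `a` is d-separated from every node of `b`."""
    return not (reachable(edges, a, given) & set(b))


def is_front_door(edges, X, Y, Z):
    """Check the three front-door conditions for Z relative to (X, Y)."""
    X, Y, Z = set(X), set(Y), set(Z)
    # Condition 1: no directed path from X to Y that avoids Z.
    _, children = _adjacency(edges)
    stack, seen = list(X), set(X)
    while stack:
        for c in children.get(stack.pop(), ()):
            if c in Z or c in seen:
                continue
            if c in Y:
                return False
            seen.add(c)
            stack.append(c)
    # Condition 2: X and Z d-separated after removing edges out of X.
    no_out_x = [(u, v) for u, v in edges if u not in X]
    if not d_separated(no_out_x, X, Z, set()):
        return False
    # Condition 3: Z and Y d-separated given X after removing edges out of Z.
    no_out_z = [(u, v) for u, v in edges if u not in Z]
    return d_separated(no_out_z, Z, Y, X)


# Classic front-door graph; U plays the role of the unobserved confounder.
edges = [("U", "X"), ("U", "Y"), ("X", "Z"), ("Z", "Y")]
print(is_front_door(edges, {"X"}, {"Y"}, {"Z"}))
```

Each of the three checks is a single graph traversal, so verifying one candidate set takes time linear in the size of the graph; finding a set without a candidate in hand is the harder problem the paper addresses.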
Abstract
Causal effect estimation from observational data is a fundamental task in the empirical sciences. It becomes particularly challenging when unobserved confounders are present in a system. This paper focuses on front-door adjustment, a classic technique that uses observed mediators to identify causal effects even in the presence of unobserved confounding. While the statistical properties of front-door estimation are quite well understood, its algorithmic aspects remained unexplored for a long time. In 2022, Jeong, Tian, and Bareinboim presented the first polynomial-time algorithm for finding sets satisfying the front-door criterion in a given directed acyclic graph (DAG), with an $O(n^3(n+m))$ run time, where $n$ denotes the number of variables and $m$ the number of edges of the causal graph. In our work, we give the first linear-time, i.e., $O(n+m)$, algorithm for this task, which thus reaches the asymptotically optimal time complexity. This result implies an $O(n(n+m))$-delay enumeration algorithm for all front-door adjustment sets, again improving on previous work by a factor of $n^3$. Moreover, we provide the first linear-time algorithm for finding a minimal front-door adjustment set. We offer implementations of our algorithms in multiple programming languages to facilitate practical usage, and we empirically validate their feasibility, even for large graphs.
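As a small illustration of why a front-door set is useful once found, the sketch below evaluates the standard front-door adjustment formula $P(y \mid do(x)) = \sum_{z} P(z \mid x) \sum_{x'} P(y \mid x', z) P(x')$ on a toy binary model and compares it with the true interventional probability. The model (an unobserved $U$ with edges $U \to X$, $U \to Y$, $X \to Z$, $Z \to Y$) and all numeric parameters are hypothetical and chosen only for this example; the code does not reproduce the algorithms of this paper.

```python
from itertools import product

# Hypothetical parameters of a toy binary model (U is unobserved).
P_U1 = 0.3                                   # P(U = 1)
P_X1_GIVEN_U = {0: 0.2, 1: 0.8}              # P(X = 1 | U = u)
P_Z1_GIVEN_X = {0: 0.1, 1: 0.7}              # P(Z = 1 | X = x)
P_Y1_GIVEN_ZU = {(0, 0): 0.1, (0, 1): 0.5,   # P(Y = 1 | Z = z, U = u)
                 (1, 0): 0.4, (1, 1): 0.9}


def bern(p1, value):
    """Probability that a binary variable with P(1) = p1 takes `value`."""
    return p1 if value == 1 else 1.0 - p1


def observational_joint():
    """Joint P(x, z, y) over the observed variables, with U summed out."""
    joint = {}
    for x, z, y in product((0, 1), repeat=3):
        joint[(x, z, y)] = sum(
            bern(P_U1, u) * bern(P_X1_GIVEN_U[u], x)
            * bern(P_Z1_GIVEN_X[x], z) * bern(P_Y1_GIVEN_ZU[(z, u)], y)
            for u in (0, 1)
        )
    return joint


def marginal(joint, x=None, z=None, y=None):
    """Sum the joint over all entries consistent with the fixed values."""
    return sum(
        p for (xv, zv, yv), p in joint.items()
        if (x is None or xv == x) and (z is None or zv == z)
        and (y is None or yv == y)
    )


def front_door_effect(joint, x, y=1):
    """Front-door formula, using only the observational joint P(x, z, y)."""
    total = 0.0
    for z in (0, 1):
        p_z_given_x = marginal(joint, x=x, z=z) / marginal(joint, x=x)
        inner = sum(
            marginal(joint, x=xp)
            * marginal(joint, x=xp, z=z, y=y) / marginal(joint, x=xp, z=z)
            for xp in (0, 1)
        )
        total += p_z_given_x * inner
    return total


def true_effect(x, y=1):
    """Ground-truth P(y | do(x)), computed with access to the hidden U."""
    return sum(
        bern(P_U1, u) * bern(P_Z1_GIVEN_X[x], z) * bern(P_Y1_GIVEN_ZU[(z, u)], y)
        for u in (0, 1) for z in (0, 1)
    )


joint = observational_joint()
naive = marginal(joint, x=1, y=1) / marginal(joint, x=1)  # P(Y = 1 | X = 1)
print(front_door_effect(joint, x=1), true_effect(1), naive)
```

In this toy model the front-door value matches the ground truth up to floating-point error, while the plain conditional probability $P(Y = 1 \mid X = 1)$ is biased by the hidden confounder.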
