
Fast Proxy Experiment Design for Causal Effect Identification

Sepehr Elahi, Sina Akbari, Jalal Etesami, Negar Kiyavash, Patrick Thiran

TL;DR

This paper tackles identifying causal effects via proxy experiments by reframing the minimum-cost intervention design (MCID) problem for causal identification in ADMGs. It introduces reformulations of MCID as a weighted partial MAX-SAT (WPMAX-SAT) problem and as an ILP, along with submodular and reinforcement-learning perspectives, enabling faster exact solutions. A SAT-construction procedure and extensions to multiple districts yield practical, scalable solvers that significantly outperform the previous state of the art, with experimental results showing substantial reductions in runtime. Additionally, the authors develop a generalized-adjustment framework that provides a polynomial-time heuristic and a principled route to identify adjustment-based designs, enhancing applicability to real-world settings where costly interventions must be minimized while ensuring identifiability.
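To make the WPMAX-SAT terminology concrete, here is a minimal brute-force sketch of a weighted partial MAX-SAT solver: hard clauses must all hold, and each falsified soft clause incurs its weight. The toy instance is a hypothetical illustration and is not the paper's encoding of MCID. It assumes three vertices with made-up intervention costs and made-up hard clauses of the form "at least one of these vertices must be intervened on". A practical solver would replace the exhaustive enumeration.

```python
from itertools import product


def solve_wpmaxsat(num_vars, hard, soft):
    """Brute-force weighted partial MAX-SAT (exponential; illustration only).

    Clauses are lists of nonzero ints: k means variable k is true,
    -k means variable k is false (variables are numbered from 1).
    hard: clauses that must all be satisfied.
    soft: (weight, clause) pairs; a falsified soft clause costs its weight.
    Returns (min_cost, assignment), or (None, None) if hard is unsatisfiable.
    """
    def satisfied(clause, assign):
        return any(assign[abs(lit) - 1] == (lit > 0) for lit in clause)

    best_cost, best_assign = None, None
    for assign in product([False, True], repeat=num_vars):
        if not all(satisfied(c, assign) for c in hard):
            continue  # violates a hard clause, so not feasible
        cost = sum(w for w, c in soft if not satisfied(c, assign))
        if best_cost is None or cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_cost, best_assign


# Hypothetical toy instance: variable i is true iff we intervene on vertex i.
# Assumed costs: vertex 1 costs 4, vertex 2 costs 1, vertex 3 costs 2.
hard = [[1, 2], [2, 3]]                   # assumed "must hit" constraints
soft = [(4, [-1]), (1, [-2]), (2, [-3])]  # pay the cost when we intervene
cost, assign = solve_wpmaxsat(3, hard, soft)
```

In this toy instance, intervening on vertex 2 alone satisfies both hard clauses, so the minimum cost is 1.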

Abstract

Identifying causal effects is a key problem of interest across many disciplines. The two long-standing approaches to estimating causal effects are observational and experimental (randomized) studies. Observational studies can suffer from unmeasured confounding, which may render the causal effects unidentifiable. On the other hand, direct experiments on the target variable may be too costly or even infeasible to conduct. A middle ground between these two approaches is to estimate the causal effect of interest through proxy experiments, which are conducted on variables that are cheaper to intervene on than the main target. Akbari et al. [2022] studied this setting, demonstrated that the problem of designing the optimal (minimum-cost) experiment for causal effect identification is NP-complete, and provided a naive algorithm that may require solving exponentially many NP-hard problems as a subroutine in the worst case. In this work, we provide a few reformulations of the problem that allow for designing significantly more efficient algorithms to solve it, as witnessed by our extensive simulations. Additionally, we study the closely related problem of designing experiments that enable us to identify a given effect through valid adjustment sets.

Paper Structure

This paper contains 29 sections, 17 theorems, 17 equations, 12 figures.

Key Result

Proposition 1

Let $\mathcal{G}$ be an ADMG over the vertices $V$. Also let $X,Y\subseteq V$ be disjoint sets of variables. Define $S=\mathrm{Anc}_{V\setminus X}(Y)$, and let $\boldsymbol{\mathcal{S}}=\{S_1,\dots,S_r\}$ be the (unique) set of maximal districts in $\mathcal{G}[S]$. The interventional distribution $
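The sets named in Proposition 1 can be computed with two graph traversals. The sketch below assumes the standard conventions: $\mathrm{Anc}_{V\setminus X}(Y)$ is the set of ancestors of $Y$ (including $Y$ itself) in the subgraph induced on $V\setminus X$, and the maximal districts of $\mathcal{G}[S]$ are the connected components of its bidirected part. The edge-list representation, the function names, and the toy graph are hypothetical and chosen only for illustration.

```python
def ancestors_in_subgraph(directed, targets, allowed):
    """Ancestors of `targets` (inclusive) in the subgraph induced on `allowed`."""
    parents = {}
    for u, v in directed:  # edge u -> v
        if u in allowed and v in allowed:
            parents.setdefault(v, set()).add(u)
    result = {t for t in targets if t in allowed}
    stack = list(result)
    while stack:
        node = stack.pop()
        for p in parents.get(node, ()):
            if p not in result:
                result.add(p)
                stack.append(p)
    return result


def districts(bidirected, vertices):
    """Connected components of the bidirected edges restricted to `vertices`."""
    neighbors = {v: set() for v in vertices}
    for u, v in bidirected:  # edge u <-> v
        if u in neighbors and v in neighbors:
            neighbors[u].add(v)
            neighbors[v].add(u)
    seen, components = set(), []
    for v in sorted(vertices):  # sorted only to make the output order stable
        if v in seen:
            continue
        comp, stack = {v}, [v]
        while stack:
            n = stack.pop()
            for m in neighbors[n]:
                if m not in comp:
                    comp.add(m)
                    stack.append(m)
        seen |= comp
        components.append(comp)
    return components


# Hypothetical toy ADMG: x -> z -> y, w -> y, x <-> y, z <-> w.
V = {"x", "z", "w", "y"}
directed = [("x", "z"), ("z", "y"), ("w", "y")]
bidirected = [("x", "y"), ("z", "w")]
X, Y = {"x"}, {"y"}

S = ancestors_in_subgraph(directed, Y, V - X)  # S = Anc_{V \ X}(Y)
S_districts = districts(bidirected, S)         # maximal districts of G[S]
```

Here $S=\{z,w,y\}$, and because the edge $x \leftrightarrow y$ leaves $S$, the districts of $\mathcal{G}[S]$ are $\{w,z\}$ and $\{y\}$.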

Figures (12)

  • Figure 1: The average runtime of our approach compared with the state of the art (SOTA) from Akbari et al. [2022].
  • Figure 3: Average time taken by Algorithm 2 of Akbari et al. [2022] (MHS), ILP, and WPMAX-SAT to solve one graph versus (a) the number of vertices in the graph and (b) the number of districts of $S$.
  • Figure 4: Average normalized cost of the heuristic algorithms $H_1$ and $H_2$ of Akbari et al. [2022] and of the proposed adjustment-based heuristic versus the number of vertices in the graph.
  • Figure 5: Semi-log plot of the average time taken by WPMAX-SAT to solve one graph versus the number of vertices in the graph.
  • Figure 6: Heatmap of the average time taken by WPMAX-SAT (left) and Algorithm 2 of Akbari et al. [2022] (right) to solve one graph versus the probabilities of directed and bidirected edges in the graph.
  • ...and 7 more figures

Theorems & Definitions (35)

  • Example 1
  • Definition 1: Identifiability
  • Remark 1
  • Definition 2: Hedge
  • Remark 2
  • Definition 3: Hedge hull (Akbari et al. [2022])
  • Proposition 1
  • Proposition 2 (Akbari et al. [2022])
  • Theorem 1
  • Lemma 1
  • ...and 25 more