BudgetIV: Optimal Partial Identification of Causal Effects with Mostly Invalid Instruments

Jordan Penn, Lee M. Gunderson, Gecia Bravo-Hermsdorff, Ricardo Silva, David S. Watson

Abstract

Instrumental variables (IVs) are widely used to estimate causal effects in the presence of unobserved confounding between exposure and outcome. An IV must affect the outcome exclusively through the exposure and be unconfounded with the outcome. We present a framework for relaxing either or both of these strong assumptions with tuneable and interpretable budget constraints. Our algorithm returns a feasible set of causal effects that can be identified exactly given relevant covariance parameters. The feasible set may be disconnected but is a finite union of convex subsets. We discuss conditions under which this set is sharp, i.e., contains all and only effects consistent with the background assumptions and the joint distribution of observable variables. Our method applies to a wide class of semiparametric models, and we demonstrate how its ability to select specific subsets of instruments confers an advantage over convex relaxations in both linear and nonlinear settings. We also adapt our algorithm to form confidence sets that are asymptotically valid under a common statistical assumption from the Mendelian randomization literature.

Paper Structure

This paper contains 54 sections, 9 theorems, 83 equations, 8 figures, 2 tables, 2 algorithms.

Key Result

Theorem 1

Assume Eqs. (eqn:Z), (eqn:X), (eqn:Y) and claims (B$1^*$), (B$2^*$) hold for some $d_{\bm \Phi} \leq d_{\bm Z}$. Assume the existence of a ground truth joint distribution $P(\bm X, Y, \bm Z)$ with finite covariance parameters $\bm{\beta_\Phi}^*, \bm{\beta_y}^*$. Then the causal parameter $\bm \theta^*$ can be iden

Figures (8)

  • Figure 1: Acyclic directed mixed graph for our problem setup. Solid circles represent observable variables and dashed circles latent variables. Bidirected arrows are interpreted as any mutual dependence between noise residuals. The dotted black arrow indicates the unobserved confounding between $\boldsymbol{X}$ and $Y$. The relevance assumption (A$1$) requires at least one of the blue arrows. The red arrows contribute to violations of the exogeneity conditions (A$2$) and (A$3$). The green arrow indicates the causal effect of interest.
  • Figure 2: The topology of a feasible set depends on the shape of the background constraints. Plots of constrained search spaces $\bm \Gamma$ (shaded) and lines $h (\theta)$ corresponding to $d_{\boldsymbol{Z}} = 2, d_{\Phi} = 1$. The intersection between a line and a shaded region determines a feasible set of $\theta$. The constraints are a subspace of $\mathbb{R}^{d_{\boldsymbol{Z}}}$ while $h (\theta)$ are $d_{\Phi}$-dimensional affine subspaces. (Left) Convex $\boldsymbol{\Gamma}$ entails convex feasible sets. (Right) Budget constraints $\bm \Gamma (\bm \tau, \bm b)$ form a star domain. They are non-convex in general and are unbounded for $b_K < d_{\bm Z}$ (i.e., some $\gamma_{g_i}$ may be unconstrained). This can lead to a disconnected feasible set or even an unidentifiable causal effect. In the appendix on unidentifiability (app:Unidentifiability) we show that unidentifiability occurs only under violation of (B$1^*$) and can be tested in polynomial time.
  • Figure 3: Our method yields sharper bounds than convex relaxations in linear models. Bounds on $\theta$ for a series of linear models with scalar exposure $X$ and $\bm \gamma_{\bm g}^* = (-2, -0.4)$. Plug-in estimators are used throughout. Orange bounds come from $\texttt{budgetIV}$ with $\bm \tau = (0.6, \tau)$; dark blue from the $L_1$-norm constraint $\lVert \bm \gamma_{\bm g}\rVert_1 \leq \tau + 0.6$; and light blue from the $L_2$-norm constraint $\lVert \bm \gamma_{\bm g} \rVert_2 \leq \sqrt{\tau^2 + 0.6^2}$. We vary $\tau$ linearly from $0$ to $10$; each bound corresponds to a separate experiment.
  • Figure 4: Budget constraints provide information about the structure of the problem. Feasible values of the ATE relative to a baseline of $x_0=0$ as exposure $X$ varies in a nonlinear SEM with $d_{\bm Z}=6$. The true ATE is given by the solid black curve. Each colored region corresponds to a unique intersection of $\bm \gamma_{\bm g}$ and the star domain $\bm \Gamma$. The union of such intersections at each value of $X$ produces a disconnected feasible set.
  • Figure 5: budgetIV captures the true causal effect when most candidates are invalid IVs. Results from a simulation study with $d_{\bm Z} = 100$ candidate IVs, $70$ of which violate (A$3$), benchmarking (Black) $95\%$ coverage of the feasible set according to budgetIV_scalar, where the budget constraints $\bm \Gamma (\tau = 0, b)$ are varied along the $x$-axis. (Blue) The optimal solution set relative to the constraint $\bm \Gamma (\tau = 0.001, b)$ (for visibility) captures the true causal effect if and only if the choice of $b$ does not exclude the ground truth $\bm \gamma_{\bm g}^{\bm *}$. (Others) Confidence intervals produced by benchmark methods do not include $\theta^*$.
  • ...and 3 more figures
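
The geometry described in the captions for Figures 2 and 3 can be illustrated with a minimal Python sketch for a scalar exposure. This is not the authors' budgetIV implementation. It assumes the standard linear relation $\gamma_{g_i}(\theta) = \beta_{y,i} - \theta \beta_{\Phi,i}$ between the covariance parameters and the violation terms, and it simplifies the budget constraint to a single threshold `tau` and a single budget `b` (at least `b` candidate instruments satisfy $|\gamma_{g_i}| \leq \tau$). The function name `budget_feasible_set` and the example numbers are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]


def budget_feasible_set(
    beta_phi: Sequence[float],
    beta_y: Sequence[float],
    tau: float,
    b: int,
) -> List[Interval]:
    """Closed intervals of theta where at least b components satisfy
    |beta_y[i] - theta * beta_phi[i]| <= tau (assumed simplified budget, tau >= 0)."""
    events = []  # (position, kind, delta); kind 0 = interval opens, 1 = interval closes
    always = 0   # components with beta_phi[i] == 0 that satisfy the bound for every theta
    for bp, by in zip(beta_phi, beta_y):
        if bp == 0.0:
            if abs(by) <= tau:
                always += 1
            continue
        # Each component constrains theta to one closed interval.
        lo, hi = sorted(((by - tau) / bp, (by + tau) / bp))
        events.append((lo, 0, +1))
        events.append((hi, 1, -1))
    if always >= b:
        return [(-math.inf, math.inf)]  # budget already met: theta is unconstrained
    # At equal positions, openings sort before closings so touching intervals merge.
    events.sort()
    feasible: List[Interval] = []
    count = always
    start = 0.0
    for pos, _, delta in events:
        count += delta
        if delta == +1 and count == b:
            start = pos                    # budget becomes satisfied here
        elif delta == -1 and count == b - 1:
            feasible.append((start, pos))  # budget stops being satisfied here
    return feasible


# Hypothetical example: true theta = 1, third candidate instrument is invalid.
beta_phi = [1.0, 2.0, 1.0]
beta_y = [1.0, 2.0, 4.0]
print(budget_feasible_set(beta_phi, beta_y, tau=0.2, b=2))  # one interval around theta = 1
print(budget_feasible_set(beta_phi, beta_y, tau=0.2, b=1))  # two disjoint intervals
```

With the looser budget `b=1`, the sketch returns two disjoint intervals, a small instance of a feasible set that is disconnected yet a finite union of convex subsets. Tightening the budget to `b=2` selects the subset of instruments that agree with one another and leaves a single interval containing the assumed true effect.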

Theorems & Definitions (13)

  • Theorem 1: Identifiability
  • Definition 1: Soundness
  • Definition 2: Completeness
  • Definition 3: Minimality
  • Theorem 2: Optimal solution map
  • Definition 4: Star domain
  • Theorem 3: $t$-point identification
  • Corollary 3.1: No point identification for $d_{\bm \Phi} > 1$
  • Corollary 3.2: $t$-point identification for $d_{\Phi} = 1$
  • Theorem 4: Coverage
  • ...and 3 more