Identification and Estimation of the Bi-Directional MR with Some Invalid Instruments

Feng Xie, Zhen Yao, Lin Xie, Yan Zeng, Zhi Geng

TL;DR

This work addresses the identifiability and estimation challenges of bi-directional Mendelian randomization with potentially invalid instruments and unmeasured confounding in a one-sample setting. It introduces a pseudo-residual framework and proves necessary and sufficient conditions for identifying valid IV sets, the causal directions, and the bidirectional causal effects, under mild assumptions. Building on these results, the authors propose PReBiM, a two-stage cluster fusion–like algorithm that (i) discovers valid IV sets, (ii) infers the causal direction for each set, and (iii) estimates $\beta_{X \to Y}$ and $\beta_{Y \to X}$ with TSLS, with asymptotic correctness guaranteed. Extensive simulations show that PReBiM outperforms existing methods in bi-directional MR, including scenarios with dependent valid IVs, highlighting its practical utility for causal inference under complex IV validity patterns.

Abstract

We consider the challenging problem of estimating causal effects from purely observational data in bi-directional Mendelian randomization (MR), where some invalid instruments, as well as unmeasured confounding, usually exist. To address this problem, most existing methods attempt to find valid instrumental variables (IVs) for the target causal effect using expert knowledge or by assuming that the causal model is a one-directional MR model. In this paper, we first theoretically investigate the identification of the bi-directional MR model from observational data. In particular, we provide necessary and sufficient conditions under which valid IV sets are correctly identified such that the bi-directional MR model is identifiable, including the causal directions of a pair of phenotypes (i.e., the treatment and outcome). Moreover, based on the identification theory, we develop a cluster fusion-like method to discover valid IV sets and estimate the causal effects of interest. We theoretically demonstrate the correctness of the proposed algorithm. Experimental results show the effectiveness of our method for estimating causal effects in bi-directional MR.

Paper Structure (38 sections, 6 theorems, 57 equations, 7 figures, 6 tables, 5 algorithms)


Key Result

Proposition 1

Assume the system is a linear bi-directional causal model (Eq. eq-model-XY). For a given causal relationship $X \to Y$ in the system, the causal effect of $X$ on $Y$ can be identified by $\beta_{X \to Y} = \left[ \mathbf{X}^{\intercal} \mathbf{P} \mathbf{X} \right]^{-1} \mathbf{X}^{\intercal} \mathbf{P} \mathbf{Y}$, where $\mathbf{P}= (\mathbf{G}_{\mathcal{V}}^{X \to Y})^{\intercal} \left[ \mathbf{G}_{\mathcal{V}}^{X \to Y} (\mathbf{G}_{\mathcal{V}}^{X \to Y})^{\intercal} \right]^{-1} \mathbf{G}_{\mathcal{V}}^{X \to Y}$ is the pro
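To make the TSLS estimator of Proposition 1 concrete, the following is a minimal numerical sketch, not the authors' implementation. It uses the standard TSLS form $(\mathbf{X}^{\intercal}\mathbf{P}\mathbf{X})^{-1}\mathbf{X}^{\intercal}\mathbf{P}\mathbf{Y}$ on simulated data from a linear bi-directional model with an unmeasured confounder. The function name `tsls`, the coefficient values, the sample size, and the choice of which instruments are valid for which direction are all assumptions made for illustration. Instruments are stored as columns (shape `(n, k)`), so the projection is written as $\mathbf{G}(\mathbf{G}^{\intercal}\mathbf{G})^{-1}\mathbf{G}^{\intercal}$, the transpose of the layout used in the proposition.

```python
import numpy as np

def tsls(G, x, y):
    """TSLS estimate of the effect of x on y, with instruments as columns of G (n, k)."""
    # Center everything so that no intercept term is needed.
    G = G - G.mean(axis=0)
    x = x - x.mean()
    y = y - y.mean()
    # Stage 1: project x onto the instruments, x_hat = P x.
    coef, *_ = np.linalg.lstsq(G, x, rcond=None)
    x_hat = G @ coef
    # Stage 2: (x' P x)^{-1} x' P y, using x' P = x_hat'.
    return float(x_hat @ y / (x_hat @ x))

rng = np.random.default_rng(0)
n = 200_000
beta_xy, beta_yx = 0.5, 0.3            # hypothetical true effects X -> Y and Y -> X

G = rng.normal(size=(n, 4))            # four hypothetical genetic variants
u = rng.normal(size=n)                 # unmeasured confounder of X and Y

# Exogenous inputs: columns 0-1 act directly on X only, columns 2-3 on Y only.
a = 0.8 * G[:, 0] + 0.6 * G[:, 1] + u + rng.normal(size=n)
b = 0.7 * G[:, 2] + 0.9 * G[:, 3] + u + rng.normal(size=n)

# Solve the simultaneous equations X = beta_yx * Y + a and Y = beta_xy * X + b.
d = 1.0 - beta_xy * beta_yx
x = (a + beta_yx * b) / d
y = (b + beta_xy * a) / d

est_xy = tsls(G[:, :2], x, y)          # instruments valid for X -> Y
est_yx = tsls(G[:, 2:], y, x)          # instruments valid for Y -> X

# Naive regression slope of y on x, for comparison; it ignores confounding and feedback.
xc, yc = x - x.mean(), y - y.mean()
ols_xy = float(xc @ yc / (xc @ xc))

print(f"TSLS X->Y: {est_xy:.3f} (true {beta_xy})")
print(f"TSLS Y->X: {est_yx:.3f} (true {beta_yx})")
print(f"OLS  X->Y: {ols_xy:.3f}")
```

In this setup each direction has its own valid IV set, so both effects are recovered by applying the same estimator with the roles of $X$ and $Y$ swapped, while the naive regression slope does not recover $\beta_{X \to Y}$. The sketch takes the valid IV sets as known; discovering them from data is the part the paper's identification results address.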

Figures (7)

  • Figure 1: Graphical illustration of a valid IV model, where dashed lines indicate the absence of arrows. ${G}$ is a valid IV relative to the causal relationship $X \to Y$.
  • Figure 2: An illustrative example in which valid and invalid IV sets induce distinct constraints: $\mathbf{G}_{\mathcal{V}}^{X \to Y} = (G_1, G_3)^{\intercal}$ is a valid IV set, while $\mathbf{G}_{\mathcal{I}}^{X \to Y} = (G_2, G_4, G_5)^{\intercal}$ is invalid due to the pathways $G_2 \to Y$, $G_4 \to Y$, and $G_5 \to Y$.
  • Figure 3: An illustrative example showing that valid and invalid IV sets may induce the same constraints in Proposition 2 (Identifying Invalid IV Sets): $\{G_1, G_2\}$ is a valid IV set in (a) but an invalid one in (b). This implies that the two-valid-IVs assumption is not sufficient to find valid IV sets.
  • Figure 4: Simple examples in which $\{G_1,G_2\}$ is a valid IV set in (a) but an invalid one in (b).
  • Figure 5: Performance comparison of sisVIVE, IV-TETRAD, TSHT, and PReBiM in estimating one-directional MR models across various sample sizes and three scenarios.
  • ...and 2 more figures
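
The distinction drawn in Figure 2 between valid and invalid IV sets can be illustrated numerically. The sketch below is an assumption-laden toy, not the paper's simulation design: it uses a one-directional simplification ($X \to Y$ only), unit coefficients, and a hypothetical helper `tsls`. As in the Figure 2 caption, $G_1$ and $G_3$ affect $Y$ only through $X$, while $G_2$, $G_4$, and $G_5$ also have direct pathways to $Y$.

```python
import numpy as np

def tsls(G, x, y):
    """TSLS estimate of the effect of x on y, with instruments as columns of G (n, k)."""
    G = G - G.mean(axis=0)
    x = x - x.mean()
    y = y - y.mean()
    coef, *_ = np.linalg.lstsq(G, x, rcond=None)  # stage 1: regress x on instruments
    x_hat = G @ coef
    return float(x_hat @ y / (x_hat @ x))         # stage 2: ratio form of TSLS

rng = np.random.default_rng(1)
n = 200_000
beta_xy = 0.5                                     # hypothetical true effect of X on Y

G = rng.normal(size=(n, 5))                       # columns 0..4 stand for G_1..G_5
u = rng.normal(size=n)                            # unmeasured confounder

# Every variant influences X (all coefficients set to 1 for simplicity).
x = G.sum(axis=1) + u + rng.normal(size=n)

# G_2, G_4, G_5 additionally act on Y directly, which makes them invalid IVs.
direct = G[:, [1, 3, 4]].sum(axis=1)
y = beta_xy * x + direct + u + rng.normal(size=n)

est_valid = tsls(G[:, [0, 2]], x, y)              # valid set (G_1, G_3)
est_all = tsls(G, x, y)                           # all five, including invalid IVs

print(f"valid set (G1, G3): {est_valid:.3f} (true {beta_xy})")
print(f"all five variants : {est_all:.3f}")
```

With the valid set the estimate stays close to the true effect, whereas including the variants with direct pathways to $Y$ shifts it away. This is why selecting valid IV sets has to come before estimation.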

Theorems & Definitions (17)

  • Proposition 1: Two-Stage Least Squares (TSLS) Estimator
  • Remark 1
  • Definition 1
  • Proposition 2: Identifying Invalid IV Sets
  • Example 1
  • Example 2: Counterexample
  • Proposition 3: Identifying Valid IV Sets
  • Proposition 4: Identifying Direction of Causal Influences
  • Theorem 1: Identifiability of Bi-directional MR model
  • Theorem 2: Correctness
  • ...and 7 more