Multivariable Bidirectional Mendelian Randomization via Bayesian Directed Cyclic Graphical Models with Correlated Errors

Bitan Sarkar, Yuchao Jiang, Tian Ge, Yang Ni

TL;DR

This work tackles the challenge of inferring causal networks in biology when feedback loops and unmeasured confounding obscure direct relationships. It introduces MR.RGM, a Bayesian multivariable Mendelian randomization framework that supports directed cycles and correlated errors, operating with summary-level data and leveraging spike-and-slab priors for sparse graph learning. Across simulations and two real genomic datasets (GTEx skeletal muscle and OneK1K B cells), MR.RGM demonstrates superior graph recovery and edge-level causal inference, with MR.RGM+ robust to horizontal pleiotropy and capable of identifying latent confounding modules and higher-order motifs. The method provides principled uncertainty quantification for edges, effects, and network motifs, and is accessible as an R package for scalable network-wide MR analysis in genomics.

Abstract

Mendelian randomization (MR) is a pivotal tool in genetics, genomics, and epidemiology, leveraging genetic variants as instrumental variables to infer causal relationships between exposures and outcomes. Traditional MR methods, while powerful, often rely on stringent assumptions such as the absence of feedback loops, which are frequently violated in complex biological networks. In addition, many popular MR approaches focus on only two variables (i.e., one exposure and one outcome), whereas our motivating applications of gene regulatory networks have many variables. In this article, we introduce a novel Bayesian framework for multivariable MR that concurrently addresses unmeasured confounding and feedback loops. Central to our approach is a sparse conditional cyclic graphical model with a sparse error variance-covariance matrix. Two structural priors are employed to enable the modeling and inference of causal relationships as well as latent confounding structures. Our method is designed to operate effectively with summary-level data, facilitating its application in contexts where individual-level data are inaccessible, e.g., due to privacy concerns. It can also account for horizontal pleiotropy, under which we establish the sufficient identifiability conditions. Through extensive simulations and applications to the GTEx and OneK1K data, we demonstrate the superior performance of our approach in recovering biologically plausible causal relationships in the presence of possible feedback loops and unmeasured confounding. Using posterior samples, we further quantify uncertainty in inferred network motifs by computing their posterior probabilities. The R package MR.RGM that implements the proposed method is available on CRAN (https://cran.r-project.org/package=MR.RGM).
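The motif-level uncertainty quantification mentioned in the abstract can be illustrated with a minimal sketch. It assumes posterior samples are available as a stack of binary adjacency matrices, with entry `[s, i, j] == 1` meaning that draw `s` contains the edge from trait `j` to trait `i`. The function name, the array layout, and the toy draws are hypothetical and are not taken from the MR.RGM package.

```python
import numpy as np

def motif_posterior_probability(adjacency_samples, motif_edges):
    """Fraction of posterior draws in which every edge of the motif is present.

    adjacency_samples: array of shape (n_draws, p, p); entry [s, i, j] == 1
        means draw s contains the edge j -> i (a convention assumed here).
    motif_edges: list of (child, parent) index pairs defining the motif.
    """
    samples = np.asarray(adjacency_samples, dtype=bool)
    present = np.ones(samples.shape[0], dtype=bool)
    for child, parent in motif_edges:
        present &= samples[:, child, parent]
    return present.mean()

# Toy "posterior": 4 hypothetical draws over p = 3 traits.
draws = np.zeros((4, 3, 3), dtype=int)
draws[[0, 1, 2], 1, 0] = 1   # edge 0 -> 1 appears in draws 0, 1, 2
draws[[0, 1, 3], 0, 1] = 1   # edge 1 -> 0 appears in draws 0, 1, 3

# A two-node feedback loop 0 <-> 1 needs both edges in the same draw.
feedback_loop = [(1, 0), (0, 1)]
prob = motif_posterior_probability(draws, feedback_loop)
print(prob)
```

Counting joint occurrences across draws, rather than multiplying marginal edge probabilities, keeps any posterior dependence between edges. In this toy example each edge has marginal probability 0.75, while the loop itself appears in only half of the draws.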

Paper Structure

This paper contains 21 sections, 1 theorem, 21 equations, 12 figures, 6 tables.

Key Result

Theorem 1

Under Assumptions cond:diversity through cond:indep, $(\mathbf{A},\mathbf{B},\boldsymbol{\Sigma}^*)$ is identifiable.

Figures (12)

  • Figure 1: Conceptual comparison of Mendelian randomization frameworks. (a)–(c) require pre-specified exposure and outcome roles. (a) and (c) do not allow feedback loops. (d) The proposed MR.RGM supports feedback loops and models all traits simultaneously without pre-specifying causal direction. Horizontal pleiotropy and unmeasured confounding are also handled in a principled way.
  • Figure 2: Overview of the proposed MR.RGM workflow. The method can take either individual-level data $(\mathbf{Y, X, U})$ or summary-level second-moment statistics plus sample size $n$ as input. These are plugged into a structural equation model with directed cycles and correlated errors. Spike-and-slab priors and MCMC are then used to infer the causal graph, the latent confounding structure, and the causal effects, along with their corresponding uncertainties.
  • Figure 3: Scale-free network with feedback loops and unmeasured confounding, with network size $p{=}10$. (a) AUC for graph recovery; (b) MAD for causal effect estimation.
  • Figure 4: Small-world network with feedback loops and unmeasured confounding, with network size $p{=}10$. (a) AUC for graph recovery; (b) MAD for causal effect estimation.
  • Figure 5: Confounding structure recovery performance using MR.RGM under feedback loops and unmeasured confounding, with network size $p=10$. (a) Scale-free; (b) Small-world.
  • ...and 7 more figures
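
The two input types in Figure 2 can be illustrated with a small simulation. It is a sketch under stated assumptions, not the package's implementation: it assumes a linear structural equation model of the form $\mathbf{y} = \mathbf{A}\mathbf{y} + \mathbf{B}\mathbf{x} + \mathbf{e}$ with $\mathrm{Cov}(\mathbf{e}) = \boldsymbol{\Sigma}$, which this section does not spell out, and it omits the covariates $\mathbf{U}$. The variable names `Syy`, `Syx`, `Sxx`, the network, and all numeric values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 5000, 3, 3          # samples, traits, instruments (illustrative sizes)

# A[i, j] is the direct effect of trait j on trait i (assumed convention).
A = np.zeros((p, p))
A[1, 0] = 0.5                 # trait 0 -> trait 1
A[0, 1] = -0.3                # trait 1 -> trait 0, closing a feedback loop
A[2, 1] = 0.4                 # trait 1 -> trait 2

B = np.diag([1.0, 0.8, 1.2])  # each instrument affects exactly one trait

# Correlated errors: the off-diagonal entry stands in for an unmeasured
# confounder shared by traits 0 and 2.
Sigma = np.eye(p)
Sigma[0, 2] = Sigma[2, 0] = 0.6

X = rng.normal(size=(n, k))
E = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

# Structural form: y = A y + B x + e. With cycles, solve the reduced form
# y = (I - A)^{-1} (B x + e), which requires I - A to be invertible.
M = np.linalg.inv(np.eye(p) - A)
Y = (X @ B.T + E) @ M.T

# Individual-level data are (Y, X); the summary-level alternative is the
# set of second moments together with the sample size n.
Syy = Y.T @ Y / n
Syx = Y.T @ X / n
Sxx = X.T @ X / n

# The instruments pin down the reduced-form coefficients (I - A)^{-1} B.
Pi_hat = np.linalg.solve(Sxx, Syx.T).T
```

Because the instruments are generated independently of the errors, `Pi_hat` approaches `M @ B` as `n` grows, even though traits 0 and 1 affect each other and the errors of traits 0 and 2 are correlated. Everything needed for that estimate is contained in `Syx` and `Sxx`, which is why second-moment summaries plus `n` can substitute for individual-level data in this sketch.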

Theorems & Definitions (1)

  • Theorem 1: Identifiability of MR.RGM+