Multivariable Bidirectional Mendelian Randomization via Bayesian Directed Cyclic Graphical Models with Correlated Errors
Bitan Sarkar, Yuchao Jiang, Tian Ge, Yang Ni
TL;DR
This work tackles the challenge of inferring causal networks in biology when feedback loops and unmeasured confounding obscure direct relationships. It introduces MR.RGM, a Bayesian multivariable Mendelian randomization framework that supports directed cycles and correlated errors, operating with summary-level data and leveraging spike-and-slab priors for sparse graph learning. Across simulations and two real genomic datasets (GTEx skeletal muscle and OneK1K B cells), MR.RGM demonstrates superior graph recovery and edge-level causal inference, with MR.RGM+ robust to horizontal pleiotropy and capable of identifying latent confounding modules and higher-order motifs. The method provides principled uncertainty quantification for edges, effects, and network motifs, and is accessible as an R package for scalable network-wide MR analysis in genomics.
Abstract
Mendelian randomization (MR) is a pivotal tool in genetics, genomics, and epidemiology that uses genetic variants as instrumental variables to infer causal relationships between exposures and outcomes. Traditional MR methods, while powerful, often rely on stringent assumptions, such as the absence of feedback loops, that are frequently violated in complex biological networks. In addition, many popular MR approaches consider only two variables (one exposure and one outcome), whereas our motivating applications, gene regulatory networks, involve many variables. In this article, we introduce a novel Bayesian framework for multivariable MR that concurrently addresses unmeasured confounding and feedback loops. Central to our approach is a sparse conditional cyclic graphical model with a sparse error variance-covariance matrix. Two structural priors are employed to enable modeling and inference of both causal relationships and latent confounding structures. Our method is designed to operate on summary-level data, which facilitates its application in contexts where individual-level data are inaccessible, e.g., due to privacy concerns. It can also account for horizontal pleiotropy, for which we establish sufficient identifiability conditions. Through extensive simulations and applications to the GTEx and OneK1K data, we demonstrate the superior performance of our approach in recovering biologically plausible causal relationships in the presence of possible feedback loops and unmeasured confounding. Using posterior samples, we further quantify uncertainty in inferred network motifs by computing their posterior probabilities. The R package MR.RGM, which implements the proposed method, is available on CRAN (https://cran.r-project.org/package=MR.RGM).
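
To illustrate the last inferential step, the posterior probability of a network motif can be estimated as the fraction of posterior graph samples in which that motif appears. The following is a minimal Python sketch of that idea, not the MR.RGM implementation: the function names, the toy samples, and the convention that `graph[r][c] == 1` encodes an edge from node `c` to node `r` are all assumptions made for illustration.

```python
def motif_posterior_probability(adjacency_samples, has_motif):
    """Monte Carlo estimate: share of sampled graphs that contain the motif."""
    hits = sum(1 for graph in adjacency_samples if has_motif(graph))
    return hits / len(adjacency_samples)


def has_feedback_loop(i, j):
    """Return a predicate for the two-node cycle between nodes i and j.

    Assumed convention: graph[r][c] == 1 means a directed edge c -> r.
    """
    return lambda graph: graph[i][j] == 1 and graph[j][i] == 1


# Hypothetical posterior draws of a 3-node directed graph (toy data only).
samples = [
    [[0, 1, 0], [1, 0, 0], [0, 1, 0]],  # loop between nodes 0 and 1
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],  # loop between nodes 0 and 1
    [[0, 0, 0], [1, 0, 0], [0, 1, 0]],  # only 0 -> 1, no loop
    [[0, 1, 0], [0, 0, 0], [0, 1, 0]],  # only 1 -> 0, no loop
]

prob = motif_posterior_probability(samples, has_feedback_loop(0, 1))
print(prob)  # 0.5
```

Because the motif indicator is evaluated jointly on each sampled graph, the estimate reflects dependence between edges, which multiplying marginal edge inclusion probabilities would ignore.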
