Table of Contents
Fetching ...

An Order of Magnitude Time Complexity Reduction for Gaussian Graphical Model Posterior Sampling Using a Reverse Telescoping Block Decomposition

Zejin Gao, Ksheera Sagar, Anindya Bhadra

Abstract

We consider the problem of fully Bayesian posterior estimation and uncertainty quantification in undirected Gaussian graphical models via Markov chain Monte Carlo (MCMC) under recently-developed element-wise graphical priors, such as the graphical horseshoe. Unlike the conjugate Wishart family, these priors are non-conjugate; but have the advantage that they naturally allow one to encode a prior belief of sparsity in the off-diagonal elements of the precision matrix, without imposing a structure on the entire matrix. Unfortunately, for a graph with $p$ nodes and with $n$ samples, the state-of-the-art MCMC approaches for the element-wise priors achieve a per iteration complexity of ${O}(p^4),$ which is prohibitive when $p\gg n$. In this regime, we develop a suitably reparameterized MCMC with per iteration complexity of ${O}(p^3)$, providing a one order of magnitude improvement, and consequently bringing the per iteration computational cost at par with the conjugate Wishart family, which is also ${O}(p^3)$ due to a use of the classical Bartlett decomposition, but this decomposition does not apply outside the Wishart family. Importantly, the proposed benefit is obtained solely due to our reparameterization in an MCMC scheme targeting the true posterior, that reverses the recently developed telescoping block decomposition of Bhadra et al. (2024), in a suitable sense. There is no variational or any other approximate Bayesian computation scheme considered in this paper that compromises targeting the true posterior. Simulations and the analysis of a breast cancer data set confirm both the correctness and better algorithmic scaling of the proposed reverse telescoping sampler.

An Order of Magnitude Time Complexity Reduction for Gaussian Graphical Model Posterior Sampling Using a Reverse Telescoping Block Decomposition

Abstract

We consider the problem of fully Bayesian posterior estimation and uncertainty quantification in undirected Gaussian graphical models via Markov chain Monte Carlo (MCMC) under recently-developed element-wise graphical priors, such as the graphical horseshoe. Unlike the conjugate Wishart family, these priors are non-conjugate; but have the advantage that they naturally allow one to encode a prior belief of sparsity in the off-diagonal elements of the precision matrix, without imposing a structure on the entire matrix. Unfortunately, for a graph with nodes and with samples, the state-of-the-art MCMC approaches for the element-wise priors achieve a per iteration complexity of which is prohibitive when . In this regime, we develop a suitably reparameterized MCMC with per iteration complexity of , providing a one order of magnitude improvement, and consequently bringing the per iteration computational cost at par with the conjugate Wishart family, which is also due to a use of the classical Bartlett decomposition, but this decomposition does not apply outside the Wishart family. Importantly, the proposed benefit is obtained solely due to our reparameterization in an MCMC scheme targeting the true posterior, that reverses the recently developed telescoping block decomposition of Bhadra et al. (2024), in a suitable sense. There is no variational or any other approximate Bayesian computation scheme considered in this paper that compromises targeting the true posterior. Simulations and the analysis of a breast cancer data set confirm both the correctness and better algorithmic scaling of the proposed reverse telescoping sampler.

Paper Structure

This paper contains 28 sections, 3 theorems, 39 equations, 11 figures, 11 tables, 3 algorithms.

Key Result

Proposition 1

Under the joint prior on $\boldsymbol\Theta$ according to Equations eq:prior1--eq:prior, we have: $f(\theta_{j j}\mid\boldsymbol{\theta}_{j+1},\ldots,\boldsymbol{\theta}_{p}) \propto 1$ and $\boldsymbol{\theta}_{\bullet{j}}\mid\boldsymbol{\theta}_{j+1},\ldots,\boldsymbol{\theta}_{p}\sim \mathcal{N}(

Figures (11)

  • Figure 1: Average raw runtimes in minutes (left panel) and runtimes relative to $p=100$ (right panel), for the GHS prior under the hubs structure. Each MCMC run uses $10{,}000$ iterations with the first $5{,}000$ discarded as burn-in. Results are averaged over 50 replications for each $p \in \{100,200,300,400,600,800\}$, with $n = 24 \log p$. At $p>400$, the cyclical sampler exceeded the 12-hour runtime limit. The RT sampler is implemented in Rcpp and the cyclical sampler in R.
  • Figure 2: Average raw runtimes in minutes (left panel) and runtimes relative to $p=100$ (right panel), for the GHSL prior under the hubs structure. Each MCMC run uses $10{,}000$ iterations with the first $5{,}000$ discarded as burn-in. Results are averaged over 50 replications for each $p \in \{100,200,300,400,600,800\}$, with $n = 24 \log p$. At $p>400$, the cyclical sampler exceeded the 12-hour runtime limit. The RT sampler is implemented in Rcpp and the cyclical sampler in R.
  • Figure 3: Inferred gene-interaction graphs under two priors (GHS, GHSL) and two sampling methods (Cyclical, RT) for the breast cancer RNA sequencing data ($p = 139, n = 90$): (a) GHS--Cyclical, (b) GHS--RT, (c) GHSL--Cyclical, (d) GHSL--RT. Isolated genes (degree 0) are omitted. Node size is proportional to degree, and node positions are shared across panels for comparability. MCMC run uses $5{,}000$ iterations with the first $1{,}000$ discarded as burn-in.
  • Figure S.1: Average raw runtimes in seconds (left panel) and runtimes relative to $p=100$ (right panel), for the GHS prior under the tridiagonal structure. Each MCMC run uses $10{,}000$ iterations with the first $5{,}000$ discarded as burn-in. Results are averaged over 50 replications for each $p \in \{100,200,300,400,600,800\}$, with $n = 24 \log p$. At $p>400$, the cyclical sampler exceeded the 12-hour wall time limit, so its raw runtime is capped at 12 hours. The RT sampler is implemented in Rcpp and the Cyclical sampler in R.
  • Figure S.2: Average raw runtimes in seconds (left panel) and runtimes relative to $p=100$ (right panel), for the GHS prior under the cliques negative structure. Each MCMC run uses $10{,}000$ iterations with the first $5{,}000$ discarded as burn-in. Results are averaged over 50 replications for each $p \in \{100,200,300,400,600,800\}$, with $n = 24 \log p$. At $p>400$, the cyclical sampler exceeded the 12-hour wall time limit, so its raw runtime is capped at 12 hours. The RT sampler is implemented in Rcpp and the Cyclical sampler in R.
  • ...and 6 more figures

Theorems & Definitions (7)

  • Remark 1: Maintaining positive definiteness of $\boldsymbol\Theta$
  • Remark 2: The notations $\boldsymbol{\Theta}$ versus $\boldsymbol{\Omega}$
  • Remark 3: Differences with pseudolikelihood and neighborhood selection
  • Remark 4: Differences with Cholesky factor-based partial regression
  • Proposition 1: Induced priors
  • Proposition 2: Full conditional posterior for diagonal terms
  • Proposition 3: Full conditional posterior for off-diagonal terms