Nested Sampling with Slice-within-Gibbs: Efficient Evidence Calculation for Hierarchical Bayesian Models

David Yallup

Abstract

We present Nested Sampling with Slice-within-Gibbs (NS-SwiG), an algorithm for Bayesian inference and evidence estimation in high-dimensional models whose likelihood admits a factorization, such as hierarchical Bayesian models. We construct a procedure to sample from the likelihood-constrained prior using a Slice-within-Gibbs kernel: an outer update of hyperparameters followed by inner block updates over local parameters. A likelihood-budget decomposition caches per-block contributions so that each local update checks feasibility in constant time rather than recomputing the global constraint at linearly growing cost. This reduces the per-replacement cost from quadratic to linear in the number of groups, and the overall algorithmic complexity from cubic to quadratic under standard assumptions. The decomposition extends naturally beyond independent observations, and we demonstrate this on Markov-structured latent variables. We evaluate NS-SwiG on challenging benchmarks, demonstrating scalability to thousands of dimensions and accurate evidence estimates even on posterior geometries where state-of-the-art gradient-based samplers can struggle.
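The likelihood-budget idea described above can be illustrated with a minimal sketch. It assumes the log-likelihood is a plain sum of per-group terms and uses a toy Gaussian block likelihood. The names `block_loglik`, `BudgetCache`, `budget`, and `try_update` are hypothetical and are not taken from the paper. The sketch shows only the cached feasibility check, not the slice-sampling proposals or the outer hyperparameter update.

```python
import numpy as np


def block_loglik(theta_j, y_j, sigma=1.0):
    """Toy Gaussian log-likelihood for one group (assumed form, constants dropped)."""
    return -0.5 * float(np.sum((y_j - theta_j) ** 2)) / sigma**2


class BudgetCache:
    """Caches per-block log-likelihoods so the global constraint
    sum_j ell_j > log_l_star can be checked by looking at one block only."""

    def __init__(self, theta, ys):
        self.ell = np.array([block_loglik(t, y) for t, y in zip(theta, ys)])
        self.total = float(self.ell.sum())

    def budget(self, j, log_l_star):
        # Smallest contribution block j may make while the other blocks
        # stay fixed: log_l_star minus the sum of all other cached terms.
        return log_l_star - (self.total - self.ell[j])

    def try_update(self, j, new_ell_j, log_l_star):
        # The check does not loop over the other groups.
        if new_ell_j > self.budget(j, log_l_star):
            self.total += new_ell_j - self.ell[j]
            self.ell[j] = new_ell_j
            return True
        return False


# Two groups with illustrative data and a hypothetical likelihood threshold.
ys = [np.array([0.0, 1.0]), np.array([2.0])]
theta = [0.5, 2.0]
cache = BudgetCache(theta, ys)
log_l_star = -1.0

# A proposal for group 0 that would violate the constraint is rejected ...
rejected = cache.try_update(0, block_loglik(1.5, ys[0]), log_l_star)
# ... while one that keeps the total above the threshold is accepted.
accepted = cache.try_update(0, block_loglik(0.0, ys[0]), log_l_star)
```

In this sketch, evaluating a proposal still costs time proportional to the size of its own block, but the feasibility check itself no longer depends on the number of groups. A long-running implementation would likely recompute `total` from `ell` periodically to limit floating-point drift.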

Paper Structure (38 sections, 18 equations, 6 figures, 7 tables, 2 algorithms)


Figures (6)

  • Figure 1: Hierarchical Gaussian model scaling. Top row (a--c): NS-SwiG vs. NSS for $J \in \{10, 50, 100, 250\}$. Bottom row (d--f): NS-SwiG extended scaling to $J \in \{100, 500, 1000\}$.
  • Figure 2: Neal's funnel (centered parameterization). (a,b) $(\psi, \theta_0)$ marginal against analytic contours (grey dashed). (c) ESS per evaluation for both parameterizations.
  • Figure 3: ESS per likelihood evaluation across benchmarks.
  • Figure 4: Hierarchical Gaussian model: 2D marginal posteriors for $(\psi, \theta_0)$.
  • Figure 5: Radon contextual effects model: hyperparameter marginals for centered (left three panels) and non-centered (right three panels) parameterizations. NUTS non-centered (black dashed) serves as reference. NS-SwiG (red), NUTS centered (green), and SMC-HMC (orange) all produce consistent marginals.
  • ...and 1 more figures