Linear-cost unbiased posterior estimates for crossed effects and matrix factorization models via couplings

Paolo Maria Ceriani, Andrea Pandolfi, Giacomo Zanella

TL;DR

This work designs and analyzes unbiased Markov chain Monte Carlo schemes based on couplings of blocked Gibbs samplers. The resulting schemes provide unbiased posterior estimates at linear cost for crossed random effect and probabilistic matrix factorization models, matching state-of-the-art procedures for both frequentist and Bayesian estimation of those models.

Abstract

We design and analyze unbiased Markov chain Monte Carlo (MCMC) schemes based on couplings of blocked Gibbs samplers (BGSs), whose total computational costs scale linearly with the number of parameters and data points. Our methodology is designed for and applicable to high-dimensional BGS with conditionally independent blocks, which are often encountered in Bayesian modeling. We provide bounds on the expected number of iterations needed for coalescence for Gaussian targets, as well as on the tails of the coalescence times distribution. These imply that practical two-step coupling strategies achieve coalescence times that match the relaxation times of the original BGS scheme up to logarithmic factors. To illustrate the practical relevance of our methodology, we apply it to high-dimensional crossed random effect and probabilistic matrix factorization models, for which we develop a novel BGS scheme with improved convergence speed. Our methodology provides unbiased posterior estimates at linear cost (usually requiring only a few BGS iterations for problems with thousands of parameters), matching state-of-the-art procedures for both frequentist and Bayesian estimation of those models.
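The coalescence step mentioned in the abstract relies on couplings under which two chains can become exactly equal. As a minimal sketch, and not the authors' implementation, the standard rejection-based maximal coupling of two univariate Gaussians with a common standard deviation can be written as follows. The function names (`normal_pdf`, `maximal_coupling`, `meeting_frequency`) and the toy means are assumptions introduced only for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def maximal_coupling(mean_p, mean_q, sd, rng):
    """Draw (X, Y) with X ~ N(mean_p, sd^2) and Y ~ N(mean_q, sd^2),
    using the standard rejection scheme that maximizes P(X == Y)."""
    x = rng.gauss(mean_p, sd)
    # Accept X as a draw from both marginals with probability min(1, q(x)/p(x)).
    if rng.random() * normal_pdf(x, mean_p, sd) <= normal_pdf(x, mean_q, sd):
        return x, x
    # Otherwise sample Y from the part of q not covered by p.
    while True:
        y = rng.gauss(mean_q, sd)
        if rng.random() * normal_pdf(y, mean_q, sd) > normal_pdf(y, mean_p, sd):
            return x, y


def meeting_frequency(gap, n_draws=2000, seed=1):
    """Empirical frequency of X == Y when the two means differ by `gap`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        x, y = maximal_coupling(0.0, gap, 1.0, rng)
        hits += (x == y)
    return hits / n_draws


freq_close = meeting_frequency(0.01)  # means almost equal: meetings are frequent
freq_far = meeting_frequency(3.0)     # means far apart: meetings are rare
```

For unit variance, the meeting probability under a maximal coupling is $2\Phi(-\delta/2)$, where $\delta$ is the gap between the means. This is why a strategy that first brings the chains close together and only then attempts to make them meet is natural.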

Paper Structure

This paper contains 49 sections, 17 theorems, 121 equations, 10 figures, 1 table, 10 algorithms.

Key Result

Lemma 1

Let $\pi=N(\boldsymbol{\mu},\Sigma)$ and let $\bar{P}^{c}$ be as in \eqref{eq:coupled_kernels_c}, with $s=K$ and $(k_1,\dots,k_K)$ being a permutation of $(1,\dots,K)$. Then, for all integers $n \ge 1$, it holds that $\left(\bar{P}^{c}\right)^n \in \Gamma_{W_2}[P^n]$.
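To give some intuition for composing $W_2$-optimal couplings of Gaussian conditionals, the sketch below runs two Gibbs chains on a toy bivariate Gaussian target and drives both with the same standard normal noise. For two Gaussians with equal variance and different means, this shared-noise construction is the $W_2$-optimal coupling. Applying it block by block, the toy target $N(0, [[1,\rho],[\rho,1]])$, and the name `crn_coupled_gibbs` are assumptions made for illustration, not the paper's definition of $\bar{P}^{c}$.

```python
import math
import random


def crn_coupled_gibbs(rho, x, y, n_sweeps, seed=0):
    """Run two Gibbs chains targeting N(0, [[1, rho], [rho, 1]]),
    updating both chains with the same noise at every block update."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho * rho)  # conditional standard deviation
    x, y = list(x), list(y)
    gaps = []
    for _ in range(n_sweeps):
        # Block 1: x1 | x2 ~ N(rho * x2, 1 - rho^2), shared noise z.
        z = rng.gauss(0.0, 1.0)
        x[0] = rho * x[1] + sd * z
        y[0] = rho * y[1] + sd * z
        # Block 2: x2 | x1 ~ N(rho * x1, 1 - rho^2), fresh shared noise z.
        z = rng.gauss(0.0, 1.0)
        x[1] = rho * x[0] + sd * z
        y[1] = rho * y[0] + sd * z
        # Euclidean distance between the two chains after this sweep.
        gaps.append(math.hypot(x[0] - y[0], x[1] - y[1]))
    return x, y, gaps


RHO = 0.9
x_final, y_final, gaps = crn_coupled_gibbs(
    RHO, (5.0, 5.0), (-5.0, -5.0), n_sweeps=20
)
```

Because the noise cancels in the difference of the two chains, the gap in the second coordinate shrinks by a factor $\rho^2$ per sweep. The chains therefore get arbitrarily close but never become exactly equal under this coupling alone, so a separate step is needed for exact coalescence.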

Figures (10)

  • Figure 1: Average meeting times and estimated bounds on a log-log scale for $K=2$, $I_1 = I_2$, $\tau_0=\tau_1 = \tau_2 =1$, Regime \ref{reg1}. Left: Algorithm \ref{alg:cg}; right: vanilla BGS.
  • Figure 2: Average meeting times and estimated bounds on a log-log scale for $K=4$, $I_1= \dots = I_4$, $\tau_k = 1$ for $k\in\{0,1,2,3,4\}$. Regime \ref{reg1} (left), Regime \ref{reg2} (right), Algorithm \ref{alg:cg}.
  • Figure 3: Model \ref{ex:ngcrem} with Laplace response, $K=2$ and $I_1 = I_2$ (left), and $K=3$ and $I_1<I_2<I_3$ (right). Average meeting times on a log-log scale for Algorithm \ref{alg:non_gauss}, obtained with unknown $\boldsymbol{\tau}$ and for different numbers of Metropolis steps $S$.
  • Figure 4: Average meeting times for the probabilistic matrix factorization model, with $I_1= I_2=I \in\{100,200,500,1000 \}$. Left: Regime \ref{reg1}; right: Regime \ref{reg2}.
  • Figure 5: Estimated meeting times and bounds for $K=2$, $I_1 = I_2$, $\tau_0=\tau_1 = \tau_2 =1$, Regime \ref{reg2}. Left: Algorithm \ref{alg:cg}; right: vanilla BGS.
  • ...and 5 more figures

Theorems & Definitions (33)

  • Lemma 1: Optimality of composition of $W_2$ couplings for Gaussians
  • Lemma 2
  • Theorem 1: Bound for reversible chains
  • Lemma 3
  • Corollary 1
  • Theorem 2
  • Corollary 2
  • Remark 1
  • Remark 2: Couplings of Metropolis-Hastings
  • Remark 3: Related literature on Bayesian factor models
  • ...and 23 more