Variance Reduction for the Independent Metropolis Sampler

Siran Liu, Petros Dellaportas, Michalis K. Titsias

Abstract

Assume that we would like to estimate the expected value of a function $F$ with respect to an intractable density $\pi$, which is specified up to some unknown normalising constant. We prove that if $\pi$ is close enough under KL divergence to another density $q$, an independent Metropolis sampler estimator that obtains samples from $\pi$ with proposal density $q$, enriched with a variance reduction computational strategy based on control variates, achieves smaller asymptotic variance than i.i.d. sampling from $\pi$. The control variates construction requires no extra computational effort but assumes that the expected value of $F$ under $q$ is analytically available. We illustrate this result by calculating the marginal likelihood in a linear regression model with prior-likelihood conflict and a non-conjugate prior. Furthermore, we propose an adaptive independent Metropolis algorithm that adapts the proposal density so that its KL divergence from the target is reduced. We demonstrate its applicability in Bayesian logistic regression and Gaussian process regression problems, and we rigorously justify our asymptotic arguments under easily verifiable and essentially minimal conditions.
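To make the setting concrete, the following is a minimal sketch of an independent Metropolis sampler combined with a control variate, assuming a $\mathcal{N}(x|0,1)$ target, a $\mathcal{N}(x|0,\sigma^2)$ proposal and $F(x)=x^2$, so that the expected value of $F$ under $q$ is $\sigma^2$. The control variate used here is a generic regression-based one built from the i.i.d. proposals; it is an illustrative assumption and not the construction derived in the paper. The function names `independent_metropolis` and `cv_estimate` are hypothetical.

```python
import numpy as np


def log_target(x):
    # Unnormalised log-density of the N(0, 1) target (assumed for illustration).
    return -0.5 * x**2


def independent_metropolis(n, sigma, rng):
    """Run n steps of independent Metropolis with a N(0, sigma^2) proposal.

    Returns the chain and the proposed values (i.i.d. draws from q).
    """
    def log_q(x):
        # Unnormalised log-density of the proposal; constants cancel in the ratio.
        return -0.5 * (x / sigma) ** 2

    proposals = rng.normal(0.0, sigma, size=n)
    log_u = np.log(1.0 - rng.uniform(size=n))  # uniform on (0, 1], avoids log(0)
    chain = np.empty(n)
    x = rng.normal(0.0, sigma)
    log_w_x = log_target(x) - log_q(x)  # log importance weight pi(x) / q(x)
    for i in range(n):
        y = proposals[i]
        log_w_y = log_target(y) - log_q(y)
        # Accept with probability min(1, w(y) / w(x)).
        if log_u[i] < log_w_y - log_w_x:
            x, log_w_x = y, log_w_y
        chain[i] = x
    return chain, proposals


def cv_estimate(f_chain, f_prop, mean_f_q):
    """Ergodic average corrected by a generic control variate.

    h = F(y) - E_q[F] has zero mean under q, so subtracting beta * mean(h)
    leaves the limit unchanged. beta is a simple plug-in coefficient, not
    the coefficient analysed in the paper.
    """
    h = f_prop - mean_f_q
    cov = np.cov(f_chain, h)
    beta = cov[0, 1] / cov[1, 1]
    return f_chain.mean() - beta * h.mean()


rng = np.random.default_rng(0)
sigma = 1.5
chain, proposals = independent_metropolis(20_000, sigma, rng)
plain = (chain**2).mean()
corrected = cv_estimate(chain**2, proposals**2, sigma**2)
print("plain ergodic average:", plain)
print("with control variate: ", corrected)
```

Both estimates target $E_\pi[F]=1$ in this toy setting. Reusing the proposed values means the correction needs no additional evaluations of the target, which mirrors the requirement that the expected value of $F$ under $q$ be available in closed form.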

Paper Structure

This paper contains 38 sections, 10 theorems, 99 equations, 2 figures, 7 tables, 4 algorithms.

Key Result

Theorem 1

(Proof in Section thmp1 of the supplementary material) Assume that for a target density $\pi(x)$ there exists a sequence of proposal distributions $\left\{q_i(x)\right\}_{i=1}^\infty$ such that $\lim_{i \rightarrow \infty} q_i(x) = \pi(x)$ and for each proposal distribution $q_i$ the corre for some constant $c$.

Figures (2)

  • Figure 1: Comparison of the $\mu_{n,IMCV}$ and $\mu_{n,MC^*}$ estimators. Top row: $\mathcal{N}(x|0,1)$ target and $\mathcal{N}(x|0,\sigma^2)$ proposal. Bottom row: $\mathcal{N}(x|0,1)$ target and $t_{\nu}(y)$ proposal. (a) and (c): Boxplots of $\mu_{n,IMCV}$ based on 20 repetitions for different values of $\sigma^2$ and $\nu$. (b) and (d): The logarithm of the VRFs and the corresponding theoretical bounds for different values of $\sigma^2$ and $\nu$.
  • Figure 2: Boxplots of VRFs for the coordinate estimates, for Gaussian targets and Gaussian proposals of different dimensions.

Theorems & Definitions (21)

  • Theorem 1
  • Proposition 1
  • Theorem 2
  • Corollary 1
  • Theorem 3
  • proof
  • proof
  • Lemma 1
  • Lemma 2
  • proof
  • ...and 11 more