On the Approximation Accuracy of Gaussian Variational Inference

Anya Katsevich, Philippe Rigollet

TL;DR

This work bounds the TV error and the mean and covariance approximation errors of Gaussian VI in terms of dimension and sample size. The analysis relies on a Hermite series expansion of the log posterior whose first terms are exactly cancelled by the first-order optimality conditions of the Gaussian VI optimization problem.

Abstract

The main computational challenge in Bayesian inference is to compute integrals against a high-dimensional posterior distribution. In the past decades, variational inference (VI) has emerged as a tractable approximation to these integrals, and a viable alternative to the more established paradigm of Markov Chain Monte Carlo. However, little is known about the approximation accuracy of VI. In this work, we bound the TV error and the mean and covariance approximation error of Gaussian VI in terms of dimension and sample size. Our error analysis relies on a Hermite series expansion of the log posterior whose first terms are precisely cancelled out by the first order optimality conditions associated to the Gaussian VI optimization problem.
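
For orientation, the first-order optimality conditions mentioned in the abstract can be written in their standard form as below. This is a sketch under stated assumptions rather than a quotation from the paper: it assumes a posterior density $\pi \propto e^{-V}$ on $\mathbb{R}^d$ with $V$ twice differentiable and sufficiently integrable, Gaussian VI defined as KL minimization over Gaussians, and $Z \sim \mathcal{N}(0, I_d)$. The paper's own notation and normalization may differ.

```latex
\begin{align*}
(\hat m, \hat S) &\in \operatorname*{arg\,min}_{m \in \mathbb{R}^d,\; S \succ 0}
  \mathrm{KL}\big(\mathcal{N}(m, S) \,\|\, \pi\big),
  \qquad \pi \propto e^{-V}, \\
\mathrm{KL}\big(\mathcal{N}(m, S) \,\|\, \pi\big)
  &= \mathbb{E}\big[V(m + S^{1/2} Z)\big] - \tfrac{1}{2}\log\det S + \text{const}, \\
\nabla_m:\quad & \mathbb{E}\big[\nabla V(m + S^{1/2} Z)\big] = 0, \\
\nabla_S:\quad & \mathbb{E}\big[\nabla^2 V(m + S^{1/2} Z)\big] = S^{-1}.
\end{align*}
```

The second condition uses the identity $\nabla_S\, \mathbb{E}[V(m + S^{1/2} Z)] = \tfrac{1}{2}\mathbb{E}[\nabla^2 V(m + S^{1/2} Z)]$ together with $\nabla_S \log\det S = S^{-1}$.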

Paper Structure (28 sections, 37 theorems, 275 equations, 1 figure)

Key Result

Lemma 2.1

Let Assumptions assume:1 and assume:glob be satisfied and let $H_V = \nabla^2 V(m_*) = n\nabla^2 v(m_*)$. Then there exists a unique $(m,S)=(\hat{m},\hat{S})$ in the set [...] which solves [...]. Moreover, $\hat{S}$ satisfies [...].

Figures (1)

  • Figure 1: Gaussian VI yields a more accurate mean estimate than does Laplace, while the two covariance estimates are on the same order. Here, $\pi_n$ is the likelihood of logistic regression given $n$ observations in dimension $d=2$. For the left-hand plot, the slopes of the best-fit lines are $-1.04$ for the Laplace approximation and $-2.02$ for Gaussian VI. For the covariance, the slopes of the best-fit lines are $-2.09$ for Laplace and $-2.12$ for VI.
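
The scaling behavior described in the figure caption can be illustrated on a much simpler one-dimensional toy problem. The sketch below does not reproduce the paper's logistic-regression experiment. It assumes a hypothetical potential $v(x) = e^x - x - 1$ and target $\pi_n \propto e^{-nv}$, chosen because the Gaussian VI stationarity conditions then have a closed-form solution. All function names are illustrative.

```python
import math


def v(x):
    # Hypothetical per-sample potential for this sketch (not the paper's
    # logistic-regression model): convex, minimized at 0, with v''(0) = 1.
    return math.exp(x) - x - 1.0


def true_mean(n, half_width=12.0, step=0.01):
    """Mean of pi_n ∝ exp(-n v), by a uniform-grid sum in units of 1/sqrt(n)."""
    scale = 1.0 / math.sqrt(n)
    k = round(half_width / step)
    num = den = 0.0
    for i in range(-k, k + 1):
        x = i * step * scale
        w = math.exp(-n * v(x))  # unnormalized density; equals 1 at the mode
        num += x * w
        den += w
    return num / den


def laplace_mean(n):
    # Laplace centers the Gaussian at the mode of pi_n, which is 0 here.
    return 0.0


def gaussian_vi_mean(n):
    # Stationarity of KL(N(m, s^2) || pi_n) with V = n v and Z ~ N(0, 1):
    #   E[V'(m + s Z)]  = n (exp(m + s^2 / 2) - 1) = 0
    #   E[V''(m + s Z)] = n exp(m + s^2 / 2)       = 1 / s^2
    # so s^2 = 1 / n and m = -s^2 / 2 = -1 / (2 n) for this particular v.
    return -1.0 / (2.0 * n)


def loglog_slope(ns, errs):
    """Least-squares slope of log(err) against log(n)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errs]
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx


def mean_error_slopes(ns=(10, 100, 1000)):
    # Absolute mean errors of both approximations, then their log-log slopes.
    exact = [true_mean(n) for n in ns]
    lap = [abs(laplace_mean(n) - mu) for n, mu in zip(ns, exact)]
    vi = [abs(gaussian_vi_mean(n) - mu) for n, mu in zip(ns, exact)]
    return loglog_slope(ns, lap), loglog_slope(ns, vi)


if __name__ == "__main__":
    slope_laplace, slope_vi = mean_error_slopes()
    print(f"Laplace slope: {slope_laplace:.2f}, Gaussian VI slope: {slope_vi:.2f}")
```

In this toy setting the Laplace mean error should decay roughly like $n^{-1}$ and the Gaussian VI mean error roughly like $n^{-2}$, which is qualitatively the pattern the caption reports for the mean. The errors here are raw absolute errors, so the slopes are not directly comparable to the figure's values.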

Theorems & Definitions (71)

  • Lemma 2.1
  • Theorem 2.1
  • Theorem 2.2: Leading order term in VI approximation error
  • Remark 2.1
  • Corollary 2.1: TV error
  • Corollary 2.2: Mean and covariance error
  • Remark 2.2
  • Lemma 2.2
  • Lemma 2.3
  • Lemma 3.1: Corollary F.6 in katsBVM with $A=I_d$
  • ...and 61 more