On the Approximation Accuracy of Gaussian Variational Inference

Anya Katsevich; Philippe Rigollet

TL;DR

This work bounds the TV error and the mean and covariance approximation errors of Gaussian VI in terms of dimension and sample size. The analysis relies on a Hermite series expansion of the log posterior whose first terms are exactly cancelled by the first-order optimality conditions of the Gaussian VI optimization problem.

Abstract

The main computational challenge in Bayesian inference is to compute integrals against a high-dimensional posterior distribution. In the past decades, variational inference (VI) has emerged as a tractable approximation to these integrals, and a viable alternative to the more established paradigm of Markov Chain Monte Carlo. However, little is known about the approximation accuracy of VI. In this work, we bound the total variation (TV) error and the mean and covariance approximation errors of Gaussian VI in terms of dimension and sample size. Our error analysis relies on a Hermite series expansion of the log posterior whose first terms are exactly cancelled by the first-order optimality conditions associated with the Gaussian VI optimization problem.
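To make the cancellation mentioned above concrete, the block below writes out the Gaussian VI problem and its first-order optimality conditions in one standard form. The notation is an assumption chosen to match the potential $V$ used later on this page ($\pi \propto e^{-V}$ on $\mathbb{R}^d$); the symbols $W$ and $r$ are introduced here for illustration only and may differ from the paper's. The block assumes the `amsmath` package.

```latex
\begin{align*}
(\hat m, \hat S) &\in \operatorname*{arg\,min}_{m \in \mathbb{R}^d,\; S \succ 0}
    \mathrm{KL}\big(\mathcal{N}(m,S)\,\|\,\pi\big),
    \qquad \pi \propto e^{-V}, \\
\mathrm{KL}\big(\mathcal{N}(m,S)\,\|\,\pi\big)
    &= \mathbb{E}\big[V(m + S^{1/2}Z)\big] - \tfrac12 \log\det S + \mathrm{const},
    \qquad Z \sim \mathcal{N}(0, I_d), \\
\text{(stationarity in } m\text{)} \quad
    & \mathbb{E}\big[\nabla V(\hat m + \hat S^{1/2}Z)\big] = 0, \\
\text{(stationarity in } S\text{)} \quad
    & \mathbb{E}\big[\nabla^2 V(\hat m + \hat S^{1/2}Z)\big] = \hat S^{-1}.
\end{align*}
% Standardized coordinates: W(x) = V(\hat m + \hat S^{1/2} x).
\begin{align*}
\mathbb{E}\big[\nabla W(Z)\big] = 0,
\qquad
\mathbb{E}\big[\nabla^2 W(Z)\big] = I_d .
\end{align*}
```

Read this way, the function $r(x) = W(x) - \|x\|^2/2$, which is the difference between the standardized log posterior and the standard Gaussian log density up to a constant, satisfies $\mathbb{E}[\nabla r(Z)] = 0$ and $\mathbb{E}[\nabla^2 r(Z)] = 0$. For smooth $r$ these expectations are its first- and second-order Hermite coefficients, so the expansion of $r$ effectively starts at third order.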

Paper Structure (28 sections, 37 theorems, 275 equations, 1 figure)

Introduction
Statement of Main Results
Assumptions on the potential
Main Results
Discussion
Logistic regression with Gaussian design
Checking the assumptions
Application of results in Section \ref{sec:main} to logistic regression
Numerical Simulation
Proof of Theorems \ref{thm:Vgen} and \ref{thm:corr}
Reduction to comparison with a standard Gaussian
Proof Outline
Dimension dependence
Key Lemmas
Existence of unique solution to stationarity conditions
...and 13 more sections

Key Result

Lemma 2.1

Let Assumptions \ref{assume:1} and \ref{assume:glob} be satisfied and let $H_V = \nabla^2V(m_*)=n\nabla^2v(m_*)$. Then there exists a unique $(m,S)=(\hat{m},\, \hat{S})$ in the set [...] which solves [...]. Moreover, $\hat{S}$ satisfies [...].
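As a rough numerical illustration of what solving the stationarity conditions can look like, the sketch below treats a one-dimensional toy case. Everything in it is an assumption made for illustration and is not taken from the paper: the potential $v(x) = x^2/2 + x^3/6 + x^4/8$ (strictly convex, with mode $m_* = 0$), the value $n = 100$, the helper name `gauss_mean`, and the simple fixed-point scheme. Expectations under $\mathcal{N}(m, s^2)$ are computed with Gauss-Hermite quadrature from NumPy.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Hypothetical 1-D example (not from the paper): V = n * v with a convex, asymmetric v.
n = 100

def dv(x):
    # v'(x) for v(x) = x^2/2 + x^3/6 + x^4/8
    return x + x**2 / 2 + x**3 / 2

def d2v(x):
    # v''(x) = 1 + x + 1.5 x^2 > 0 everywhere, so v is strictly convex
    return 1 + x + 1.5 * x**2

# Gauss-Hermite nodes/weights for the weight exp(-z^2 / 2), normalised to N(0, 1).
nodes, weights = hermegauss(20)
weights = weights / np.sqrt(2 * np.pi)

def gauss_mean(f, m, s2):
    """E[f(m + sqrt(s2) * Z)] for Z ~ N(0, 1), computed by quadrature."""
    return np.sum(weights * f(m + np.sqrt(s2) * nodes))

# Laplace approximation: mode m_* = 0 and variance 1 / V''(m_*) = 1 / n.
m_laplace, s2_laplace = 0.0, 1.0 / (n * d2v(0.0))

# Fixed-point iteration on the two stationarity conditions, started at Laplace.
m, s2 = m_laplace, s2_laplace
for _ in range(100):
    s2_new = 1.0 / (n * gauss_mean(d2v, m, s2))   # enforce E[V''] = 1 / S
    m = m - s2_new * n * gauss_mean(dv, m, s2)    # Newton-type step towards E[V'] = 0
    s2 = s2_new

# Residuals of the two stationarity conditions at the final iterate.
grad_residual = n * gauss_mean(dv, m, s2)
hess_residual = n * gauss_mean(d2v, m, s2) - 1.0 / s2
print(m, s2, grad_residual, hess_residual)
```

In this toy case the iterate moves slightly away from the Laplace values `(m_laplace, s2_laplace)`, since the Gaussian VI mean accounts for the skew of $v$ while the mode does not. The fixed-point scheme is only one convenient choice for a well-conditioned 1-D problem; it is not claimed to be the method used in the paper.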

Figures (1)

Figure 1: Gaussian VI yields a more accurate mean estimate than does Laplace, while the two covariance estimates are on the same order. Here, $\pi_n$ is the likelihood of logistic regression given $n$ observations in dimension $d=2$. For the left-hand plot, the slopes of the best-fit lines are $-1.04$ for the Laplace approximation and $-2.02$ for Gaussian VI. For the covariance, the slopes of the best-fit lines are $-2.09$ for the Laplace approximation and $-2.12$ for Gaussian VI.
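Best-fit slopes like those quoted in the caption are typically read off a log-log plot of error against $n$: a slope near $-1$ suggests an error decaying like $1/n$, and a slope near $-2$ like $1/n^2$. The sketch below shows one common way to compute such a slope with a least-squares fit. The data are synthetic and invented for illustration (an exact $3 n^{-2}$ decay); they are not the paper's simulation results.

```python
import numpy as np

# Synthetic errors, invented for illustration: err = 3 * n**(-2).
n_values = np.array([100, 200, 400, 800, 1600], dtype=float)
errors = 3.0 * n_values ** (-2.0)

# Least-squares line through (log n, log err); the slope estimates the decay exponent.
slope, intercept = np.polyfit(np.log(n_values), np.log(errors), 1)
print(slope)
```

With real simulation output, `errors` would hold the measured mean or covariance error at each sample size, and the fitted slope would only approximate the underlying rate.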

Theorems & Definitions (71)

Lemma 2.1
Theorem 2.1
Theorem 2.2: Leading order term in VI approximation error
Remark 2.1
Corollary 2.1: TV error
Corollary 2.2: Mean and covariance error
Remark 2.2
Lemma 2.2
Lemma 2.3
Lemma 3.1: Corollary F.6 in katsBVM with $A=I_d$
...and 61 more
