Manifold Gaussian Variational Bayes on the Precision Matrix

Martin Magris, Mostafa Shabani, Alexandros Iosifidis

TL;DR

The Manifold Gaussian Variational Bayes on the Precision matrix (MGVBP) solution provides simple update rules, is straightforward to implement, and gains a significant computational advantage from its precision matrix parameterization.

Abstract

We propose an optimization algorithm for Variational Inference (VI) in complex models. Our approach relies on natural gradient updates where the variational space is a Riemannian manifold. We develop an efficient algorithm for Gaussian Variational Inference whose updates satisfy the positive definite constraint on the variational covariance matrix. Our Manifold Gaussian Variational Bayes on the Precision matrix (MGVBP) solution provides simple update rules, is straightforward to implement, and gains a significant computational advantage from its precision matrix parameterization. Due to its black-box nature, MGVBP stands as a ready-to-use solution for VI in complex models. Over five datasets, we empirically validate our approach on different statistical and econometric models and discuss its performance with respect to baseline methods.
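
The positive definite constraint mentioned in the abstract can be illustrated with a generic retraction on the manifold of symmetric positive definite matrices. The sketch below is not the paper's update rule, which this summary does not state. It assumes the commonly used second-order retraction R_P(xi) = P + xi + 0.5 * xi P^{-1} xi, and the function name `spd_retraction`, the matrix size, and the random step are hypothetical choices made for illustration.

```python
import numpy as np

def spd_retraction(P, xi):
    """Second-order retraction on the SPD manifold: P + xi + 0.5 * xi P^{-1} xi."""
    xi = 0.5 * (xi + xi.T)  # symmetrize the tangent vector
    return P + xi + 0.5 * xi @ np.linalg.solve(P, xi)

rng = np.random.default_rng(0)
d = 4
A = rng.standard_normal((d, d))
P = A @ A.T + d * np.eye(d)   # a positive definite matrix standing in for a precision matrix
G = rng.standard_normal((d, d))
xi = -5.0 * (G + G.T)         # deliberately large symmetric step (assumed, not from the paper)

naive = P + xi                # plain additive step, no guarantee of positive definiteness
retracted = spd_retraction(P, xi)

min_eig_P = np.linalg.eigvalsh(P).min()
min_eig_naive = np.linalg.eigvalsh(naive).min()
min_eig_retracted = np.linalg.eigvalsh(retracted).min()
print(min_eig_naive, min_eig_retracted)
```

The retracted matrix equals 0.5 * P + 0.5 * (P + xi) P^{-1} (P + xi), so its smallest eigenvalue is at least half that of P for any symmetric step, whereas the plain additive step carries no such guarantee.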

Paper Structure (37 sections, 1 theorem, 48 equations, 11 figures, 19 tables, 2 algorithms)

Key Result

Proposition 5.1

For a $d$-dimensional Gaussian variational posterior whose mean is denoted by $\bm{\mu}$ and covariance matrix by $\Sigma$, consider the following two parameterizations: the canonical parameterization $\bm{\zeta}^c = (\bm{\mu}^\top, \text{vec}(\Sigma)^\top)^\top$ and the inverse parameterization ...
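
As a small illustration of the canonical parameterization in the proposition, the sketch below stacks the mean and the column-wise vectorization of the covariance into one vector of length d + d^2. The helper name `canonical_params` and the example values are hypothetical and serve only to show the layout of the vector.

```python
import numpy as np

def canonical_params(mu, Sigma):
    """Stack mu and vec(Sigma) (column-major) into one vector of length d + d*d."""
    mu = np.asarray(mu, dtype=float).ravel()
    Sigma = np.asarray(Sigma, dtype=float)
    return np.concatenate([mu, Sigma.flatten(order="F")])

# Hypothetical 2-dimensional example
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
zeta_c = canonical_params(mu, Sigma)
print(zeta_c.shape)
```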

Figures (11)

  • Figure 1: Manifold illustration. Left: manifold (black), tangent space (light blue), and Riemannian gradient at the point in black. Middle: exponential map (dotted gray) and the corresponding point on the manifold (green point). Right: parallel transport between vectors on two tangent planes.
  • Figure 2: Variational inference for the simple linear regression model, with 100 observations simulated according to $Y = 2X + \bm{\varepsilon}$, $\varepsilon_i \sim \mathcal{N}(0,1)$, $X = [0,0.05,0.1,\dots, 5]$, for the MGVB update and the revisited version (labeled "MGVB rev.") applying retraction on the Riemannian gradient in $\mathcal{M}$. The red circle denotes the true posterior parameter computed with standard results in Bayesian linear regression.
  • Figure 3: Logistic regression. Left: dynamics of the lower bound across the iterations. Center and right: marginal posteriors for the model intercept and the coefficient of the first regressor for MGVBP and MGVB. A kernel density from MCMC samples and the ML solutions are overlaid.
  • Figure 4: Logistic regression. Dynamics of the performance metrics across the iterations on the train and test set for the MGVBP and MGVB optimizers.
  • Figure 5: Parameter learning across the iterations for the MGVBP algorithm on the Labour dataset for some selected variational parameters. Dotted lines correspond to the diagonal case, dashed lines to the use of the $h$-function gradient estimator.
  • ...and 6 more figures
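
The "true posterior parameter" referenced in the Figure 2 caption can be sketched with the standard conjugate result for Bayesian linear regression. The code below is an illustration under stated assumptions and not the paper's experimental setup: it assumes a single slope with no intercept, a known noise variance of 1, and a zero-mean Gaussian prior on the slope with a hypothetical variance `tau2`. The seed is likewise arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 101)              # grid 0, 0.05, ..., 5 as written in the caption
y = 2.0 * x + rng.standard_normal(x.size)   # Y = 2X + eps, eps_i ~ N(0, 1)

sigma2 = 1.0    # noise variance, assumed known
tau2 = 100.0    # prior variance of the slope (hypothetical choice)

# Conjugate Gaussian posterior for the slope, written in terms of its precision
post_precision = 1.0 / tau2 + (x @ x) / sigma2
post_var = 1.0 / post_precision
post_mean = post_var * (x @ y) / sigma2
print(post_mean, post_var)
```

In this setting the posterior precision is the prior precision plus the data term, which is the kind of closed-form target a Gaussian variational approximation should recover.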

Theorems & Definitions (1)

  • Proposition 5.1