The Variational Gaussian Process

Dustin Tran, Rajesh Ranganath, David M. Blei

TL;DR

This paper introduces the Variational Gaussian Process (VGP), a Bayesian nonparametric variational family that learns flexible posteriors by warping latent inputs through Gaussian process mappings. The authors prove a universal approximation theorem showing that the VGP can approximate any posterior over a finite number of latent variables with a continuous quantile function, and they develop a black-box variational objective with an auxiliary model to enable scalable inference across diverse generative models. They report state-of-the-art results for unsupervised learning with models such as the deep latent Gaussian model (DLGM) and DRAW, evaluated on the binarized MNIST and Sketch datasets. The work positions the VGP as a general-purpose inference engine and hints at future roles in Monte Carlo methods and optimization landscape analysis.

Abstract

Variational inference is a powerful tool for approximate inference, and it has been recently applied for representation learning with deep generative models. We develop the variational Gaussian process (VGP), a Bayesian nonparametric variational family, which adapts its shape to match complex posterior distributions. The VGP generates approximate posterior samples by generating latent inputs and warping them through random non-linear mappings; the distribution over random mappings is learned during inference, enabling the transformed outputs to adapt to varying complexity. We prove a universal approximation theorem for the VGP, demonstrating its representative power for learning any model. For inference we present a variational objective inspired by auto-encoders and perform black box inference over a wide class of models. The VGP achieves new state-of-the-art results for unsupervised learning, inferring models such as the deep latent Gaussian model and the recently proposed DRAW.
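The sampling procedure described in the abstract (draw a latent input, warp it through a random mapping, then draw mean-field samples parameterized by the mapping) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The standard normal latent input, the squared-exponential kernel, the Gaussian mean-field likelihood with fixed scale `noise_scale`, and the representation of the variational data $\mathcal{D}$ as input-output pairs `(s, t)` are all assumptions made for the example, as are the function names `rbf_kernel` and `sample_vgp`.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between rows of a (n, c) and b (m, c).

    The kernel choice is an assumption for this sketch.
    """
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def sample_vgp(rng, s, t, noise_scale=0.1, jitter=1e-6):
    """Draw one approximate posterior sample z of dimension d.

    s: (m, c) inputs and t: (m, d) outputs, a hypothetical stand-in for
    the variational data D that the random mapping is conditioned on.
    """
    m, c = s.shape
    d = t.shape[1]

    # Step 1: draw a latent input xi (assumed standard normal here).
    xi = rng.standard_normal((1, c))

    # Step 2: draw the random mapping evaluated at xi, using the usual
    # GP conditional given (s, t), independently for each output dimension.
    k_ss = rbf_kernel(s, s) + jitter * np.eye(m)
    k_xs = rbf_kernel(xi, s)                      # shape (1, m)
    k_xx = rbf_kernel(xi, xi)                     # shape (1, 1)
    weights = np.linalg.solve(k_ss, k_xs.T)       # shape (m, 1)
    mean = (weights.T @ t).ravel()                # shape (d,)
    var = max(k_xx[0, 0] - (k_xs @ weights)[0, 0], 0.0)
    f_xi = mean + np.sqrt(var) * rng.standard_normal(d)

    # Step 3: draw mean-field samples parameterized by the mapping
    # (assumed Gaussian with a fixed scale in this sketch).
    return f_xi + noise_scale * rng.standard_normal(d)


rng = np.random.default_rng(0)
s = rng.standard_normal((5, 2))   # five variational inputs in R^2
t = rng.standard_normal((5, 3))   # matching outputs in R^3
z = sample_vgp(rng, s, t)         # one sample of three latent variables
```

In this sketch, the pairs `(s, t)` and the kernel hyperparameters are the quantities that would be adjusted during inference, which is how the distribution over mappings can adapt its shape; the sketch only shows sampling and does not optimize them.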

Paper Structure

This paper contains 21 sections, 2 theorems, 26 equations, 4 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

Let $q(\boldsymbol{\mathbf{z}};\boldsymbol{\mathbf{\theta}}, \mathcal{D})$ denote the VGP. Consider a posterior distribution $p(\boldsymbol{\mathbf{z}}\,|\,\boldsymbol{\mathbf{x}})$ with a finite number of latent variables and continuous quantile function (inverse CDF). There exists a sequence of pa

Figures (4)

  • Figure 1: (a) Graphical model of the VGP. The VGP generates samples of latent variables $\boldsymbol{\mathbf{z}}$ by evaluating random non-linear mappings of latent inputs $\boldsymbol{\mathbf{\xi}}$, and then drawing mean-field samples parameterized by the mapping. These latent variables aim to follow the posterior distribution for a generative model (b), conditioned on data $\boldsymbol{\mathbf{x}}$.
  • Figure 2: Sequence of domain mappings during inference, from variational latent variable space $\mathcal{R}$ to posterior latent variable space $\mathcal{Q}$ to data space $\mathcal{P}$. We perform variational inference in the posterior space and auxiliary inference in the variational space.
  • Figure 2: Negative predictive log-likelihood for Sketch, learned over hundreds of epochs over all 18,000 training examples.
  • Figure 3: Generated images from DRAW with a VGP (top), and DRAW with the original variational auto-encoder (bottom). The VGP learns texture and sharpness, and is able to sketch more complex shapes.

Theorems & Definitions (3)

  • Theorem 1: Universal approximation
  • Theorem \ref{theorem:limit}
  • proof