Variational Gaussian Process Diffusion Processes

Prakhar Verma, Vincent Adam, Arno Solin

TL;DR

An alternative parameterization of the Gaussian variational process is proposed using a site-based exponential family description. It trades a slow inference algorithm based on fixed-point iterations for a fast convex-optimization algorithm akin to natural gradient descent, and it also provides a better objective for learning model parameters.

Abstract

Diffusion processes are a class of stochastic differential equations (SDEs) providing a rich family of expressive models that arise naturally in dynamic modelling tasks. Probabilistic inference and learning under generative models with latent processes endowed with a non-linear diffusion process prior are intractable problems. We build upon work within variational inference, approximating the posterior process as a linear diffusion process, and point out pathologies in the approach. We propose an alternative parameterization of the Gaussian variational process using a site-based exponential family description. This allows us to trade a slow inference algorithm with fixed-point iterations for a fast algorithm for convex optimization akin to natural gradient descent, which also provides a better objective for learning model parameters.
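To make the notion of a non-linear diffusion process prior concrete, the following is a minimal sketch that draws one sample path from an SDE using the Euler–Maruyama scheme. It is an illustration only and not the inference method described in the abstract. The double-well drift `theta * x * (1 - x**2)`, the value `theta = 4.0`, the unit noise scale, and the function names are all assumptions chosen for this example.

```python
import math
import random


def double_well_drift(x, theta=4.0):
    """Hypothetical double-well drift f(x) = theta * x * (1 - x^2).

    The drift vanishes at x = -1, 0, +1; the form and theta are assumptions.
    """
    return theta * x * (1.0 - x * x)


def euler_maruyama(drift, x0, t_end, dt, noise_scale=1.0, seed=0):
    """Simulate dx = drift(x) dt + noise_scale dW on [0, t_end] with step dt."""
    rng = random.Random(seed)
    n_steps = int(round(t_end / dt))
    path = [x0]
    x = x0
    for _ in range(n_steps):
        # Brownian increment over one step: dW ~ N(0, dt)
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + drift(x) * dt + noise_scale * dw
        path.append(x)
    return path


# One prior draw starting in the right-hand well.
path = euler_maruyama(double_well_drift, x0=1.0, t_end=5.0, dt=0.01)
```

The returned list holds the state at each grid point, including the initial value. A smaller `dt` reduces discretization error at the cost of more steps.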

Paper Structure

This paper contains 60 sections, 95 equations, 12 figures, 2 tables, and 2 algorithms.

Figures (12)

  • Figure 1: Approximating $p_{\mathcal{D}}$ with $q$ for inference and learning in DPs.
  • Figure 2: Left: Sequential Monte Carlo (SMC) samples and posterior of our CVI-DP for a non-linear diffusion process with skew and emerging modes between observations. Middle: ELBO iterations highlight faster inference with CVI-DP vs. VDP. Right: Exact log-likelihood and ELBO as a function of $\theta$ for parameter learning.
  • Figure 3: Approximate inference under a Double-Well prior (draws from prior on the left). Middle: Approximate posterior processes for CVI-DP and VDP overlaid on the SMC ground-truth samples. Right: Our CVI-DP converges quickly even with a large discretization step when inferring the variational parameters, while VDP suffers from slow convergence even with a small discretization step.
  • Figure 4: Approximate inference under a stochastic van der Pol oscillator prior (draws from prior on the left). Middle: Approximate posterior processes for CVI-DP and VDP overlaid on the SMC ground-truth samples. Right: CVI-DP converges quickly even with a large discretization step when inferring the variational parameters, while VDP suffers from slow convergence even with a small discretization step. SMC did not converge with the provided budget and requires more particles in the multi-dimensional setup.
  • Figure 5: Faster learning of the Double-Well DP parameter $\theta$ (M-Step) with the proposed CVI-DP method compared to VDP.
  • ...and 7 more figures