Advances in Variational Inference

Cheng Zhang, Judith Butepage, Hedvig Kjellstrom, Stephan Mandt

TL;DR

This work surveys variational inference along four interconnected axes: scalability, applicability beyond conjugate models, accuracy of the approximation, and amortized inference for fast, data-conditional predictions. It foregrounds stochastic variational inference and black-box and reparameterization-based methods for handling large datasets and non-conjugate models, and it details structured and alternative-divergence approaches that improve posterior fidelity. The paper highlights advances in VAEs, normalizing flows, Stein-discrepancy methods, and hierarchical and temporal variational forms as major pillars of Bayesian deep learning and scalable probabilistic modeling. By outlining practical tricks, theoretical developments, and probabilistic programming tools, it argues that VI is central to modern uncertainty-aware AI, with applications to deep generative modeling, time-series analysis, and large-scale inference. The review concludes with a roadmap for automatic VI and future research directions in theory, automation, and integration with deep learning systems.

Abstract

Many modern unsupervised or semi-supervised machine learning algorithms rely on Bayesian probabilistic models. These models are usually intractable and thus require approximate inference. Variational inference (VI) lets us approximate a high-dimensional Bayesian posterior with a simpler variational distribution by solving an optimization problem. This approach has been successfully used in various models and large-scale applications. In this review, we give an overview of recent trends in variational inference. We first introduce standard mean field variational inference, then review recent advances focusing on the following aspects: (a) scalable VI, which includes stochastic approximations, (b) generic VI, which extends the applicability of VI to a large class of otherwise intractable models, such as non-conjugate models, (c) accurate VI, which includes variational models beyond the mean field approximation or with atypical divergences, and (d) amortized VI, which implements the inference over local latent variables with inference networks. Finally, we provide a summary of promising future research directions.
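As a brief illustration of the optimization problem mentioned above, the standard identity behind VI can be written as follows. This is a sketch in generic notation: $\bm{x}$ and $\bm{z}$ denote the observations and latent variables as in the figure captions, while the symbol $\lambda$ for the variational parameters and $\mathcal{L}$ for the objective are assumptions made here, not notation confirmed by this page.

$$
\log p(\bm{x}) \;=\; \underbrace{\mathbb{E}_{q(\bm{z};\lambda)}\big[\log p(\bm{x},\bm{z}) - \log q(\bm{z};\lambda)\big]}_{\mathcal{L}(\lambda)} \;+\; \mathrm{KL}\big(q(\bm{z};\lambda)\,\|\,p(\bm{z}\mid\bm{x})\big)
$$

Because the KL term is non-negative and $\log p(\bm{x})$ does not depend on $\lambda$, maximizing the lower bound $\mathcal{L}(\lambda)$ is equivalent to minimizing the KL divergence from the variational distribution to the true posterior. Under the mean field assumption, the variational distribution factorizes over the latent variables, $q(\bm{z};\lambda) = \prod_{i} q(z_i;\lambda_i)$.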

Paper Structure

This paper contains 61 sections, 45 equations, 2 figures.

Figures (2)

  • Figure 1: A graphical model of the observations $\bm{x}$, which depend on underlying local hidden factors $\bm{\xi}$ and global parameters $\theta$. We use $\bm{z} = \{\theta, \bm{\xi}\}$ to represent all latent variables. $M$ is the number of data points, and $N$ is the number of latent variables.
  • Figure 2: The graphical representation of stochastic variational inference (a) and the variational autoencoder (b). Dashed lines indicate variational approximations.