Variational Learning of Fractional Posteriors

Kian Ming A. Chai, Edwin V. Bonilla

Abstract

We introduce a novel one-parameter variational objective that lower-bounds the data evidence and enables the estimation of approximate fractional posteriors. We extend this framework to hierarchical constructions and to Bayes posteriors, offering a versatile tool for probabilistic modelling. We demonstrate two cases where gradients can be obtained analytically, and we present a simulation study on mixture models showing that our fractional posteriors can be used to achieve better calibration than posteriors from the conventional variational bound. When applied to variational autoencoders (VAEs), our approach attains higher evidence bounds and enables learning of high-performing approximate Bayes posteriors jointly with fractional posteriors. We show that VAEs trained with fractional posteriors produce decoders that are better aligned for generation from the prior.
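For context, the following is the standard definition of a fractional (tempered) posterior, not necessarily the paper's exact notation: for a likelihood $p(x \mid \theta)$ and prior $p(\theta)$, the fractional posterior with temperature $\gamma \in (0, 1]$ is

\[
p_\gamma(\theta \mid x) \;=\; \frac{p(x \mid \theta)^{\gamma}\, p(\theta)}{\int p(x \mid \theta')^{\gamma}\, p(\theta')\, \mathrm{d}\theta'},
\]

which recovers the Bayes posterior at $\gamma = 1$. By comparison, the conventional variational bound referred to in the abstract is the ELBO,

\[
\log p(x) \;\ge\; \mathbb{E}_{q(\theta)}\!\left[\log p(x \mid \theta)\right] - \mathrm{KL}\!\left(q(\theta)\,\|\,p(\theta)\right),
\]

whose maximiser over all distributions $q$ is the Bayes posterior. The abstract states that the one-parameter objective $\mathcal{L}_\gamma$ likewise lower-bounds $\log p(x)$ while its maximiser approximates a fractional posterior of the form above; the exact form of the bound is developed in the body of the paper.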

Figures (1)

  • Figure 1: We train VAEs on the Fashion-MNIST dataset using $\mathcal{L}_\gamma$ for different values of $\gamma$. We obtain mean images by decoding latent variables that are systematically sampled via a coordinate-wise inverse-CDF (standard normal) transform from the unit square. Fig. 1b shows the images obtained using the Bayes posterior (learnt with the ELBO), and Fig. 1c shows those obtained using a fractional posterior very close to the prior. The last image is the heat map of the corresponding prior densities.
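As a concrete illustration of the sampling scheme described in the caption (a sketch, not the authors' code): a uniform grid on the unit square is pushed coordinate-wise through the standard normal quantile function, yielding latent codes spread according to the $\mathcal{N}(0, I)$ prior. A minimal sketch in Python, assuming a two-dimensional latent space; the decoder named at the end is hypothetical:

import numpy as np
from scipy.stats import norm

# Uniform grid on the open unit square; avoid 0 and 1, where the
# inverse CDF diverges to +/- infinity.
n = 10
u = np.linspace(0.05, 0.95, n)
U1, U2 = np.meshgrid(u, u)

# Coordinate-wise inverse-CDF (probit) transform: each uniform
# coordinate is mapped through the standard normal quantile
# function, so the grid covers the bulk of the N(0, I) prior.
Z = np.stack([norm.ppf(U1), norm.ppf(U2)], axis=-1)  # shape (n, n, 2)

# `decoder` is hypothetical: any map from a 2-D latent code to the
# mean of p(x | z), e.g. the decoder network of a trained VAE.
# images = decoder(Z.reshape(-1, 2))

Because the quantile transform is monotone in each coordinate, neighbouring grid points map to neighbouring latent codes, so the resulting image array visualises how smoothly the decoder varies over the region of latent space favoured by the prior.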