Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling

Denis Blessing; Xiaogang Jia; Johannes Esslinger; Francisco Vargas; Gerhard Neumann

TL;DR

This work introduces a benchmark that evaluates sampling methods on a standardized task suite with a broad range of performance criteria. It also studies existing metrics for quantifying mode collapse and proposes novel ones.

Abstract

Monte Carlo methods, Variational Inference, and their combinations play a pivotal role in sampling from intractable probability distributions. However, current studies lack a unified evaluation framework, relying on disparate performance measures and limited method comparisons across diverse tasks, complicating the assessment of progress and hindering the decision-making of practitioners. In response to these challenges, our work introduces a benchmark that evaluates sampling methods using a standardized task suite and a broad range of performance criteria. Moreover, we study existing metrics for quantifying mode collapse and introduce novel metrics for this purpose. Our findings provide insights into strengths and weaknesses of existing sampling methods, serving as a valuable reference for future developments. The code is publicly available.

Paper Structure (33 sections, 36 equations, 7 figures, 17 tables)

Introduction
Preliminaries
Quantifying Mode-Collapse
Benchmarking Methods
Benchmarking Target Densities
Hyperparameters and Tuning
Experiments
Evaluation on Synthetic Target Densities
Evaluation on Real Target Densities
Discussion and Conclusion
Conclusion
Performance Criteria Details
Density-Ratio-Based Criteria
Integral Probability Metrics
Extending the Entropic Mode Coverage
...and 18 more sections

Figures (7)

Figure 1: Illustration of the evidence upper (EUBO) and lower bound (ELBO). The mode-seeking nature of reverse KL results in $\text{ELBO} \ll \log Z$ if the model density $q^{\theta}$ (indicated by the samples $\color{red}{\times}$) averages over the target $\pi$ (indicated by the level plot) ($t_1$) and $\text{ELBO} \approx \log Z$ if $\pi \geq 0$ whenever $q^{\theta} \geq 0$ ($t_2$-$t_4$). As a result, the ELBO is not sensitive to mode collapse. In contrast, the mass-covering nature of the forward KL ensures that $\text{EUBO} \gg \log Z$ if $q^{\theta} \approx 0$ whenever $\pi > 0$ ($t_2$) and $\text{EUBO} \approx \log Z$ if $q^{\theta} \geq 0$ whenever $\pi \geq 0$ ($t_1$). Consequently, the EUBO is well suited to quantify mode collapse.
Figure 2: Mean and standard deviation of EMC values for MoG and MoS across varying dimensions $d$.
Figure 3: Synthetic target densities. Left: First two dimensions of the funnel density. Middle: Mixture of Student-t distribution with $15$ components (MoS). Right: Mixture of $40$ isotropic Gaussian distributions (MoG).
Figure 4: Visualization of samples drawn from different sampling methods for Funnel (top) and MoG (bottom).
Figure 5: Visualization of samples drawn from different sampling methods for Digits (top) and Fashion (bottom).
...and 2 more figures
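
The behaviour described in the Figure 1 caption can be reproduced in a small numerical sketch. The example below is not taken from the paper or its code release. It assumes a hypothetical one-dimensional target: an equally weighted mixture of two unit-variance Gaussians at -4 and 4, scaled by an arbitrary known constant Z = 3, and two hypothetical Gaussian models, one collapsed onto a single mode and one broad enough to cover both. It estimates the ELBO as the average of log(unnormalized target) - log q over samples from q, and the EUBO as the same quantity averaged over samples from the target. All names (`log_target_unnorm`, `elbo_and_eubo`, and so on) are illustrative.

```python
import numpy as np

LOG_Z = np.log(3.0)            # assumed normalizer, known only because the target is synthetic
MODES = np.array([-4.0, 4.0])  # assumed mode locations of the toy target


def log_normal(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std) - 0.5 * np.log(2.0 * np.pi)


def log_target_unnorm(x):
    """Log of Z * (0.5 N(x; -4, 1) + 0.5 N(x; 4, 1))."""
    comps = log_normal(x[:, None], MODES[None, :], 1.0) + np.log(0.5)
    return LOG_Z + np.logaddexp(comps[:, 0], comps[:, 1])


def sample_target(rng, n):
    """Exact samples from the normalized toy target."""
    centers = rng.choice(MODES, size=n)
    return centers + rng.standard_normal(n)


def elbo_and_eubo(q_mean, q_std, n=100_000, seed=0):
    """Monte Carlo estimates of the ELBO and EUBO for a Gaussian model q."""
    rng = np.random.default_rng(seed)
    x_q = q_mean + q_std * rng.standard_normal(n)   # samples from the model
    x_pi = sample_target(rng, n)                    # samples from the target
    elbo = np.mean(log_target_unnorm(x_q) - log_normal(x_q, q_mean, q_std))
    eubo = np.mean(log_target_unnorm(x_pi) - log_normal(x_pi, q_mean, q_std))
    return elbo, eubo


# Model collapsed onto the right-hand mode.
collapsed_elbo, collapsed_eubo = elbo_and_eubo(q_mean=4.0, q_std=1.0)
# Broad model that covers both modes (variance matched to the target).
broad_elbo, broad_eubo = elbo_and_eubo(q_mean=0.0, q_std=np.sqrt(17.0))

for name, elbo, eubo in [
    ("collapsed", collapsed_elbo, collapsed_eubo),
    ("broad", broad_elbo, broad_eubo),
]:
    print(f"{name:>9}: log Z - ELBO = {LOG_Z - elbo:.2f}, EUBO - log Z = {eubo - LOG_Z:.2f}")
```

In this toy setting the collapsed model pays only about log 2 in the ELBO (the log of the inverse weight of the mode it covers), which is less than the broad model pays, so the ELBO alone would rank the collapsed model higher. The EUBO gap of the collapsed model is an order of magnitude larger than that of the broad model, which is the sensitivity to mode collapse that the caption describes. Estimating the EUBO requires samples from the target, which are available here only because the target is synthetic.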
