Variational Potential Flow: A Novel Probabilistic Framework for Energy-Based Generative Modelling

Junn Yong Loo, Michelle Adeline, Arghya Pal, Vishnu Monn Baskaran, Chee-Ming Ting, Raphael C. -W. Phan

TL;DR

This work tackles the training inefficiency and instability that implicit MCMC sampling causes in energy-based models (EBMs) by introducing Variational Potential Flow (VAPO). VAPO combines log-homotopy density interpolation, ODE-based potential-flow transport, and a variational Deep Ritz-style energy loss to align a flow-driven prior with an approximate data-likelihood path, so that samples can be drawn from the trained energy model without MCMC. The key contributions are (i) a log-homotopy framework bridging the prior and the data likelihood, (ii) a probabilistic Poisson-type equation that guides density evolution through a potential field, and (iii) a tractable energy loss with a CNN-based energy parameterization that yields competitive unconditional image generation and smooth interpolation on CIFAR-10 and CelebA. The approach could stabilize and accelerate EBM training and offers a possible alternative to diffusion-like models for efficient probabilistic generative modelling.

Abstract

Energy-based models (EBMs) are appealing for their generality and simplicity in data likelihood modeling, but have conventionally been difficult to train due to the unstable and time-consuming implicit MCMC sampling during contrastive divergence training. In this paper, we present a novel energy-based generative framework, Variational Potential Flow (VAPO), that entirely dispenses with implicit MCMC sampling and does not rely on complementary latent models or cooperative training. The VAPO framework aims to learn a potential energy function whose gradient (flow) guides the prior samples, so that their density evolution closely follows an approximate data likelihood homotopy. An energy loss function is then formulated to minimize the Kullback-Leibler divergence between the density evolution of the flow-driven prior and the data likelihood homotopy. After the potential energy is trained, images can be generated by initializing samples from a Gaussian prior and solving the ODE governing the potential flow on a fixed time interval using generic ODE solvers. Experimental results show that the proposed VAPO framework is capable of generating realistic images on various image datasets. In particular, our proposed framework achieves competitive FID scores for unconditional image generation on the CIFAR-10 and CelebA datasets.
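
The sampling procedure described in the abstract (draw from a Gaussian prior, then integrate the potential-flow ODE over a fixed time interval) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the quadratic potential stands in for the trained CNN energy, forward Euler stands in for a generic ODE solver, the flow is taken to be $dx/dt = \nabla\Phi(x)$, and the names `grad_potential`, `sample_by_flow`, and `mean_distance` are hypothetical.

```python
import math
import random


def grad_potential(x, target):
    """Gradient of the toy potential Phi(x) = -0.5 * ||x - target||^2.

    Assumption: this closed-form potential is a stand-in for a trained
    energy network; its gradient points from x towards `target`.
    """
    return [t - xi for xi, t in zip(x, target)]


def sample_by_flow(n_samples, target, n_steps=100, t_end=1.0, seed=0):
    """Draw standard Gaussian prior samples and push them along dx/dt = grad Phi(x).

    Uses forward Euler on [0, t_end]; returns (prior_samples, flowed_samples).
    """
    rng = random.Random(seed)
    dt = t_end / n_steps
    prior, flowed = [], []
    for _ in range(n_samples):
        x0 = [rng.gauss(0.0, 1.0) for _ in target]  # sample from the prior
        x = list(x0)
        for _ in range(n_steps):
            g = grad_potential(x, target)
            x = [xi + dt * gi for xi, gi in zip(x, g)]  # one Euler step
        prior.append(x0)
        flowed.append(x)
    return prior, flowed


def mean_distance(points, target):
    """Average Euclidean distance from each point to `target`."""
    return sum(math.dist(p, target) for p in points) / len(points)


if __name__ == "__main__":
    target = [3.0, -2.0]
    prior, flowed = sample_by_flow(200, target)
    print("mean distance before flow:", mean_distance(prior, target))
    print("mean distance after flow: ", mean_distance(flowed, target))
```

In this toy setting the flow simply contracts every sample towards `target`. With a learned energy, the gradient would come from automatic differentiation of the network and the Euler loop could be replaced by any off-the-shelf ODE solver.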

Paper Structure (30 sections, 3 theorems, 59 equations, 9 figures, 3 tables, 1 algorithm)


Key Result

Proposition 1

Consider the data likelihood homotopy $\bar{\rho}(x; t)$ defined in the marginalized-homotopy equation, with Gaussian conditional data likelihood $p(\bar{x}|x) = \mathcal{N}(\bar{x}; x, \Pi)$. Then its evolution in time $t \in [0, 1]$ is given by a PDE [equation missing in this excerpt], where [symbol missing in this excerpt] is the innovation term in the conditional data likelihood, and $\bar{\gamma}(x,\bar{x}) = \mathbb{E}_{\rho(x;\,\ldots)}[\,\ldots\,]$ [definition truncated in this excerpt].
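
As background for the Gaussian conditional data likelihood named in Proposition 1, its logarithm is a quadratic form in the residual $\bar{x} - x$:

$$\log p(\bar{x}|x) = -\tfrac{1}{2}(\bar{x} - x)^{\top}\Pi^{-1}(\bar{x} - x) - \tfrac{1}{2}\log\det(2\pi\Pi).$$

A log-homotopy is commonly written in the following form. This form is an assumption here, since the paper's own definition is not part of this excerpt, and $q(x)$ is a symbol introduced only for illustration to denote the prior density:

$$\rho(x|\bar{x}; t) \propto q(x)\, p(\bar{x}|x)^{t}, \qquad t \in [0, 1].$$

Under this assumed form, $t = 0$ recovers the prior and $t = 1$ gives a density proportional to $q(x)\,p(\bar{x}|x)$, so the time derivative of the log-density involves the quadratic form above. That would be consistent with an innovation term built from the residual $\bar{x} - x$, though this reading is a guess rather than something stated in the excerpt.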

Figures (9)

  • Figure 1: A planar visualization of the potential-generated field (represented by coloured arrows) that transports the prior particles towards the approximate data likelihood (represented by the blue contour).
  • Figure 2: Generated samples on unconditional CIFAR-10 $32\times 32$ (left) and CelebA $64\times 64$ (right).
  • Figure 3: Interpolation results between the leftmost and rightmost generated CelebA $64 \times 64$ samples.
  • Figure 4: Histogram of energy output for CIFAR-10 train and test set.
  • Figure 5: Generated samples and their five nearest neighbours in the CIFAR-10 train set based on pixel distance.
  • ...and 4 more figures

Theorems & Definitions (9)

  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • proof
  • proof
  • proof
  • proof