Unifying Generative Models with GFlowNets and Beyond

Dinghuai Zhang, Ricky T. Q. Chen, Nikolay Malkin, Yoshua Bengio

TL;DR

The paper addresses the fragmentation of deep generative modeling by proposing GFlowNets as a unifying probabilistic framework in which sampling is a Markovian trajectory through a DAG, governed by forward and backward policies and flow-based constraints. It shows that a broad range of models, including hierarchical VAEs (HVAEs), diffusion models, the Schrödinger bridge, autoregressive models, and normalizing flows, can be interpreted as GFlowNets with particular policy specifications, and it relates their standard training objectives to trajectory balance and to KL divergences between trajectory distributions. A practical contribution is the MLE-GFN algorithm, which uses trajectory balance consistency as a regularizer to improve generative modeling and is demonstrated on synthetic 2D tasks and a CIFAR-10 diffusion setting. The work provides a concrete recipe for applying GFlowNet insights to modeling and sampling, with potential impact on training efficiency, mode coverage, and the integration of diverse generative paradigms.
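To make the trajectory balance condition mentioned above concrete, the following is a minimal sketch on a hand-built DAG. The graph, rewards, and policy tables are hypothetical illustrations, not taken from the paper, and the condition is written in its commonly used form $Z \prod_t P_F(s_{t+1} \mid s_t) = R(x) \prod_t P_B(s_t \mid s_{t+1})$ for every complete trajectory ending in $x$. When the log-residual of this identity is zero on all trajectories, the forward policy samples terminal states with probability proportional to their reward.

```python
import math

# Hypothetical toy DAG (not from the paper): s0 -> a -> {x, y}, s0 -> b -> y.
REWARD = {"x": 1.0, "y": 3.0}
Z = sum(REWARD.values())  # total flow, i.e. the partition function

# Forward policy P_F(child | parent), keyed as (parent, child).
P_F = {("s0", "a"): 0.5, ("s0", "b"): 0.5,
       ("a", "x"): 0.5, ("a", "y"): 0.5,
       ("b", "y"): 1.0}
# Backward policy P_B(parent | child), keyed as (child, parent); y has two parents.
P_B = {("a", "s0"): 1.0, ("b", "s0"): 1.0,
       ("x", "a"): 1.0,
       ("y", "a"): 1.0 / 3.0, ("y", "b"): 2.0 / 3.0}

TRAJECTORIES = [("s0", "a", "x"), ("s0", "a", "y"), ("s0", "b", "y")]


def tb_residual(traj):
    """Return log(Z * prod P_F) - log(R(x) * prod P_B) along one trajectory."""
    steps = list(zip(traj[:-1], traj[1:]))
    log_forward = math.log(Z) + sum(math.log(P_F[(s, t)]) for s, t in steps)
    log_backward = math.log(REWARD[traj[-1]]) + sum(
        math.log(P_B[(t, s)]) for s, t in steps
    )
    return log_forward - log_backward


def terminal_marginal():
    """Sum forward trajectory probabilities by terminal state."""
    probs = {x: 0.0 for x in REWARD}
    for traj in TRAJECTORIES:
        p = 1.0
        for s, t in zip(traj[:-1], traj[1:]):
            p *= P_F[(s, t)]
        probs[traj[-1]] += p
    return probs


residuals = [tb_residual(t) for t in TRAJECTORIES]
marginal = terminal_marginal()
print(residuals)  # all close to zero for these tables
print(marginal)   # proportional to REWARD
```

In a learned setting the squared residual would typically serve as a per-trajectory loss, with $Z$ and the policies parameterized by neural networks; here the tables are fixed by hand so the identity can be checked directly.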

Abstract

There are many frameworks for deep generative modeling, each often presented with their own specific training algorithms and inference methods. Here, we demonstrate the connections between existing deep generative models and the recently introduced GFlowNet framework, a probabilistic inference machine which treats sampling as a decision-making process. This analysis sheds light on their overlapping traits and provides a unifying viewpoint through the lens of learning with Markovian trajectories. Our framework provides a means for unifying training and inference algorithms, and provides a route to shine a unifying light over many generative models. Beyond this, we provide a practical and experimentally verified recipe for improving generative modeling with insights from the GFlowNet perspective.

Paper Structure

This paper contains 31 sections, 6 theorems, 38 equations, 3 figures, 3 tables, 1 algorithm.

Key Result

Proposition 2

Training hierarchical latent variable models with the KL-trajectory balance $\mathcal{D}_{\mathrm{KL}}\left( P_B(\tau)\Vert P_F(\tau)\right)$ objective is equivalent to training HVAEs by maximizing its ELBO, in the sense of having the same global optimum.
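A small numerical check can illustrate why the two objectives share an optimum. The sketch assumes a single latent layer and the following notation, which is an illustrative choice rather than the paper's: the backward trajectory distribution is $P_B(\tau) = p_{\mathrm{data}}(x)\, q(z \mid x)$ (data followed by the encoder) and the forward one is $P_F(\tau) = p(z)\, p(x \mid z)$ (prior followed by the decoder). Under these assumptions, $\mathcal{D}_{\mathrm{KL}}\left(P_B(\tau)\Vert P_F(\tau)\right) = -\mathbb{E}_{p_{\mathrm{data}}}[\mathrm{ELBO}(x)] - H(p_{\mathrm{data}})$, and since the entropy term does not depend on the model parameters, minimizing the KL is the same as maximizing the expected ELBO. All probability tables below are made up for the example.

```python
import math

# Hypothetical toy model: binary x and binary z; all tables are illustrative.
p_data = [0.3, 0.7]                  # data distribution over x
q = [[0.6, 0.4], [0.2, 0.8]]         # encoder q(z | x), rows indexed by x
prior = [0.5, 0.5]                   # prior p(z)
dec = [[0.9, 0.1], [0.25, 0.75]]     # decoder p(x | z), rows indexed by z


def kl_trajectory():
    """KL(P_B || P_F) with P_B = p_data(x) q(z|x) and P_F = p(z) p(x|z)."""
    total = 0.0
    for x in range(2):
        for z in range(2):
            pb = p_data[x] * q[x][z]
            pf = prior[z] * dec[z][x]
            total += pb * math.log(pb / pf)
    return total


def elbo(x):
    """ELBO(x) = E_q[log p(x, z) - log q(z | x)]."""
    return sum(
        q[x][z] * (math.log(prior[z] * dec[z][x]) - math.log(q[x][z]))
        for z in range(2)
    )


expected_elbo = sum(p_data[x] * elbo(x) for x in range(2))
entropy = -sum(p * math.log(p) for p in p_data)

# The two sides of the identity should agree up to floating-point error.
gap = kl_trajectory() - (-expected_elbo - entropy)
print(gap)
```

The same bookkeeping extends to several latent layers by letting the trajectory contain one state per layer, in which case both distributions factor into per-step policies.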

Figures (3)

  • Figure 1: NLL performance of baseline and proposed methods. The gray line shows the results of the DDPM baseline; the colored lines show the results of the proposed GFlowNet consistency augmented training started with the pretrained weights of the baseline method from $20000, 40000, \ldots, 180000$ steps, respectively.
  • Figure 2: Top: Visualization of the samples for synthetic problems from ground truth. Bottom: Visualization of the samples generated with the proposed GFlowNet-based algorithm.
  • Figure 3: Left: visualization of trajectory examples on CIFAR-$10$ image space. Right: MLE-GFN generated samples.

Theorems & Definitions (6)

  • Proposition 2
  • Proposition 4 (informal)
  • Proposition 8 (Øksendal, 1985)
  • Proposition 9
  • Proposition 13
  • Proposition 14