Generative Modeling with Flow-Guided Density Ratio Learning

Alvin Heng, Abdul Fatir Ansari, Harold Soh

TL;DR

FDRL tackles the density-chasm problem in gradient-flow generative modeling by progressively training a density-ratio estimator on samples refined via a stale flow, bridging a simple prior and complex data distributions without extra generator training. By combining data-dependent priors with flow-guided training, FDRL scales to high-dimensional image synthesis (up to $128\times128$) and extends naturally to class-conditional generation and unpaired image-to-image translation. Empirical results show competitive performance against gradient-flow baselines, with clear benefits from the two-stage sampling strategy and the ability to leverage pretrained classifiers for conditioning. The approach broadens the applicability of gradient-flow methods, offering a simple, scalable alternative to diffusion and EBMs for several practical generative tasks.

Abstract

We present Flow-Guided Density Ratio Learning (FDRL), a simple and scalable approach to generative modeling which builds on the stale (time-independent) approximation of the gradient flow of entropy-regularized f-divergences introduced in recent work. Specifically, the intractable time-dependent density ratio is approximated by a stale estimator given by a GAN discriminator. This is sufficient in the case of sample refinement, where the source and target distributions of the flow are close to each other. However, this assumption is invalid for generation and a naive application of the stale estimator fails due to the large chasm between the two distributions. FDRL proposes to train a density ratio estimator such that it learns from progressively improving samples during the training process. We show that this simple method alleviates the density chasm problem, allowing FDRL to generate images of dimensions as high as $128\times128$, as well as outperform existing gradient flow baselines on quantitative benchmarks. We also show the flexibility of FDRL with two use cases. First, unconditional FDRL can be easily composed with external classifiers to perform class-conditional generation. Second, FDRL can be directly applied to unpaired image-to-image translation with no modifications needed to the framework. Our code is publicly available at https://github.com/clear-nus/fdrl.

Paper Structure (32 sections, 2 theorems, 29 equations, 21 figures, 4 tables, 2 algorithms)

This paper contains 32 sections, 2 theorems, 29 equations, 21 figures, 4 tables, 2 algorithms.

Key Result

Lemma 0

Let $\gamma=1$ and assume that $q({\mathbf{x}})\sim U[a,b]$, where $a,b$ are chosen appropriately (e.g., $[-1,1]$ for image pixels). Then the stationary distribution of Eq. \ref{eq:sde_stale}, $\rho_\infty({\mathbf{x}})$, has the same maximum likelihood estimate as $p({\mathbf{x}})$: $\mathop{\mathrm{arg\,max}}_{\mathbf{x}} \rho_\infty({\mathbf{x}}) = \mathop{\mathrm{arg\,max}}_{\mathbf{x}} p({\mathbf{x}})$.
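
The reasoning behind the lemma can be sketched as follows. This is only an outline under assumptions not shown in this excerpt: the stale flow is taken to be an SDE with drift $-\nabla_{\mathbf{x}} f'(r({\mathbf{x}}))$ and diffusion $\sqrt{2\gamma}$, where $r = q/p$ is the time-independent density ratio, $f$ is strictly convex, and the constant $c$ denotes the value of the uniform density $q$ on its support. The paper's own proof may proceed differently.

```latex
\begin{align*}
% Assumed form of the stale flow (not shown in this excerpt):
d{\mathbf{x}}_t &= -\nabla_{\mathbf{x}} f'\big(r({\mathbf{x}}_t)\big)\,dt + \sqrt{2\gamma}\,d{\mathbf{w}}_t,
  \qquad r({\mathbf{x}}) = \frac{q({\mathbf{x}})}{p({\mathbf{x}})} \\
% Langevin dynamics with potential U = f'(r) have a Gibbs stationary density (when normalizable):
\rho_\infty({\mathbf{x}}) &\propto \exp\!\Big(-\tfrac{1}{\gamma}\, f'\big(r({\mathbf{x}})\big)\Big) \\
% With \gamma = 1 and q = c constant on [a,b]:
\rho_\infty({\mathbf{x}}) &\propto \exp\!\Big(-f'\big(c / p({\mathbf{x}})\big)\Big) \\
% f strictly convex => f' strictly increasing => the right-hand side is strictly increasing in p(x):
\mathop{\mathrm{arg\,max}}_{\mathbf{x}} \rho_\infty({\mathbf{x}}) &= \mathop{\mathrm{arg\,max}}_{\mathbf{x}} p({\mathbf{x}}) \\
% Special case, KL divergence: f(r) = r \log r, so f'(r) = \log r + 1 and
\rho_\infty({\mathbf{x}}) &\propto \exp\!\big(-\log c + \log p({\mathbf{x}}) - 1\big) \propto p({\mathbf{x}})
\end{align*}
```

In the KL case the stationary density is therefore proportional to $p$ itself on $[a,b]$, not merely sharing its mode; for other strictly convex $f$ only the location of the maximum is guaranteed to match under these assumptions.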

Figures (21)

  • Figure 1: Left: Illustration of FDRL's training setup for the $\tau$-th training iteration. For clarity, we emphasize the choices of the $f$-divergence and $g$, the Bregman divergence function, as part of training. Right: the various applications of FDRL, ranging from unconditional image generation to class-conditional generation by composition with external classifiers and unpaired image-to-image translation.
  • Figure 2: Toy experiments by simulating the flow Eq. \ref{eq:sde_stale} with an MLP density ratio estimator. The source distribution $q({\mathbf{x}})$ is set as $\mathcal{N}(\mathbf{0}, 0.1\mathbb{I})$. Blue particles represent the source particles, and orange particles represent the same particles after flowing for $K$ steps. (a) We set $p({\mathbf{x}}) \sim \mathcal{N}(\mathbf{1}, 0.1\mathbb{I})$ and flow for $K=15$ steps. (b) We set $p({\mathbf{x}}) \sim \mathcal{N}(\mathbf{6}, 0.1\mathbb{I})$ and flow for $K=400$ steps. We can clearly see that particles in (a) have converged to the target distribution, while particles in (b) have not, demonstrating the density chasm problem.
  • Figure 3: Plot of $\tilde{q}_\tau$ as training progresses in the toy example of Fig. \ref{fig:stale_no_works}. The total number of training steps is $T=1000$, such that $\tau \in [0, T-1]$. In the top row, blue particles are samples from $q'$ while orange particles are samples from $\tilde{q}_\tau$ at specific training steps. Left: $\tau = 10$, center: $\tau=40$, right: $\tau=1000$. In the bottom row, we plot the trajectory of the mean of $\tilde{q}_\tau$ as training progresses.
  • Figure 4: Samples from FDRL-DDP on CIFAR10 $32^2$, CelebA $64^2$ and LSUN Church $128^2$ using LSIF-$\chi^2$. More results using various BD objectives and $f$-divergences can be found in the Appendix.
  • Figure 5: FID as a function of the total flow length on CIFAR10 for LSIF-$\chi^2$ DDP when sampling from a model trained with $K=100$.
  • ...and 16 more figures
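
The Figure 2 caption describes moving particles along the flow for $K$ discrete steps. A minimal sketch of such a particle simulation is given below, with several simplifications that are assumptions and not the paper's setup: the flow is taken to have drift $-\nabla f'(r)$ and noise scale $\sqrt{2\gamma}$, $f$ is the KL divergence, the source is uniform as in the Key Result lemma (not the Gaussian source of Figure 2), and the closed-form score of a 1D Gaussian target stands in for the MLP density ratio estimator. The function names, the step size `eta`, and all numeric values are hypothetical.

```python
import numpy as np

def grad_log_ratio(x, mu, sigma):
    # Assumption: r(x) = q(x) / p(x) with q constant (uniform) on its support
    # and p = N(mu, sigma^2), so d/dx log r(x) = -d/dx log p(x) = (x - mu) / sigma^2.
    return (x - mu) / sigma**2

def simulate_flow(x0, mu, sigma, K, eta=1e-3, gamma=1.0, seed=0):
    """Euler-Maruyama discretization of the assumed flow for K steps."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(K):
        # For KL, f'(r) = log r + 1, so the drift -grad f'(r) equals -grad log r.
        drift = -grad_log_ratio(x, mu, sigma)
        noise = rng.standard_normal(x.shape)
        x = x + eta * drift + np.sqrt(2.0 * eta * gamma) * noise
    return x

# Source particles drawn from U[-1, 1]; hypothetical target N(0.2, 0.15^2),
# chosen so that nearly all target mass lies inside [-1, 1] and the
# boundary of the uniform support can be ignored in this sketch.
x0 = np.random.default_rng(1).uniform(-1.0, 1.0, size=5000)
xK = simulate_flow(x0, mu=0.2, sigma=0.15, K=2000)

print("source mean/std:", x0.mean(), x0.std())
print("flowed mean/std:", xK.mean(), xK.std())
```

With a uniform source and $\gamma=1$ this update coincides with unadjusted Langevin dynamics on $p$, so the flowed particles should concentrate around the target mean. It does not reproduce the density chasm failure in panel (b), which arises from the learned estimator and is not modeled by a closed-form ratio.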

Theorems & Definitions (3)

  • Lemma 0
  • Lemma 0
  • proof