Generative Modeling with Flow-Guided Density Ratio Learning
Alvin Heng, Abdul Fatir Ansari, Harold Soh
TL;DR
FDRL tackles the density-chasm problem in gradient-flow generative modeling by progressively training a density-ratio estimator on samples refined via a stale flow, bridging a simple prior and complex data distributions without extra generator training. By combining data-dependent priors with flow-guided training, FDRL scales to high-dimensional image synthesis (up to $128\times128$) and extends naturally to class-conditional generation and unpaired image-to-image translation. Empirical results show competitive performance against gradient-flow baselines, with clear benefits from the two-stage sampling strategy and the ability to leverage pretrained classifiers for conditioning. The approach broadens the applicability of gradient-flow methods, offering a simple, scalable alternative to diffusion and EBMs for several practical generative tasks.
Abstract
We present Flow-Guided Density Ratio Learning (FDRL), a simple and scalable approach to generative modeling which builds on the stale (time-independent) approximation of the gradient flow of entropy-regularized f-divergences introduced in recent work. Specifically, the intractable time-dependent density ratio is approximated by a stale estimator given by a GAN discriminator. This is sufficient in the case of sample refinement, where the source and target distributions of the flow are close to each other. However, this assumption is invalid for generation and a naive application of the stale estimator fails due to the large chasm between the two distributions. FDRL proposes to train a density ratio estimator such that it learns from progressively improving samples during the training process. We show that this simple method alleviates the density chasm problem, allowing FDRL to generate images of dimensions as high as $128\times128$, as well as outperform existing gradient flow baselines on quantitative benchmarks. We also show the flexibility of FDRL with two use cases. First, unconditional FDRL can be easily composed with external classifiers to perform class-conditional generation. Second, FDRL can be directly applied to unpaired image-to-image translation with no modifications needed to the framework. Our code is publicly available at https://github.com/clear-nus/fdrl.
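To make the idea of a stale (time-independent) density-ratio flow concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes the KL divergence, for which the flow's drift reduces to the gradient of the log density ratio, and it replaces the learned estimator with the analytic log ratio between two 1D Gaussians. The function names `grad_log_ratio` and `stale_flow`, the Gaussian parameters, the step size, and the noise scale `gamma` are all illustrative assumptions.

```python
import math
import random


def grad_log_ratio(x, src_mean=0.0, src_std=2.0, tgt_mean=1.0, tgt_std=1.0):
    """Gradient in x of log(q(x) / p(x)) for a Gaussian source q and target p.

    Assumption: in practice this ratio would be estimated by a trained
    network; here it is analytic so the sketch stays self-contained.
    """
    grad_log_q = -(x - src_mean) / src_std**2
    grad_log_p = -(x - tgt_mean) / tgt_std**2
    return grad_log_q - grad_log_p


def stale_flow(x, n_steps=200, step_size=0.05, gamma=0.01, rng=None):
    """Discretized flow with a fixed (stale) ratio: drift plus Gaussian noise.

    The ratio is evaluated between the initial source and the target at
    every step and is never updated as the samples move.
    """
    rng = rng or random.Random(0)
    noise_scale = math.sqrt(2.0 * gamma * step_size)
    for _ in range(n_steps):
        x = x - step_size * grad_log_ratio(x) + noise_scale * rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)
# Draw initial samples from the source N(0, 2^2) and push them through the flow.
samples = [stale_flow(rng.gauss(0.0, 2.0), rng=rng) for _ in range(500)]
sample_mean = sum(samples) / len(samples)
```

With these assumed parameters the drift is linear, $0.75x - 1$, so samples contract towards $x = 4/3$ instead of reproducing the target $\mathcal{N}(1, 1)$ exactly. This illustrates, in a toy setting only, that a ratio which is never updated gives an approximate flow.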
