DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows

Mashrur M. Morshed, Vishnu Boddeti

TL;DR

DiverseFlow addresses the problem of obtaining diverse samples from flow-based generative models under a fixed sampling budget. It is a training-free, inference-time mechanism that couples multiple flow trajectories via a determinantal point process (DPP). It defines a volume-based diversity objective through a kernel matrix $L$ and the DPP likelihood $\mathcal{L} = \det(L)/\det(L+I)$, and incorporates the gradient of the log-likelihood into the flow dynamics as $d\mathbf{x}_t^{(i)} = [v_\theta(\mathbf{x}_t^{(i)},t) - \gamma(t) \nabla_{\mathbf{x}_t^{(i)}} \log \mathcal{L}(\{\hat{\mathbf{x}}_1^{(1)},\dots,\hat{\mathbf{x}}_1^{(k)}\})]\,dt$, where $\hat{\mathbf{x}}_1^{(i)} = \mathbf{x}_t^{(i)} + (1-t)\,v_\theta(\mathbf{x}_t^{(i)},t)$ is the target sample estimated from time $t$ and $k$ is the number of coupled samples. The method improves mode coverage in text-guided image generation with polysemous prompts, large-hole inpainting, and class-conditional synthesis, and behaves consistently across multiple flow-matching (FM) formulations. Its limitations include reliance on the modes the FM model has already learned, added computational cost, and entangled meanings in ambiguous prompts. Overall, DiverseFlow offers a practical way to diversify flows without retraining, and points to future work on disentangling meanings and integrating DPPs with training-based models.
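The coupled dynamics in the TL;DR can be illustrated with a small, dependency-free sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the RBF kernel and its bandwidth `h`, the constant `gamma`, the toy velocity field `contract` (a hypothetical stand-in for the learned $v_\theta$ that pulls all samples toward one point `MU`), and the finite-difference gradient used in place of automatic differentiation. The sketch steps in the direction that raises the DPP log-likelihood of the estimates $\hat{\mathbf{x}}_1^{(i)}$; whether that appears as a plus or a minus in front of $\gamma(t)$ depends on the sign convention chosen for the objective.

```python
import math

def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d

def dpp_log_likelihood(points, h=0.5):
    """log det(L) - log det(L + I) for an RBF kernel L (kernel choice is an assumption)."""
    n = len(points)
    L = [[math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / (2 * h * h))
          for q in points] for p in points]
    L_plus_I = [[L[i][j] + (i == j) for j in range(n)] for i in range(n)]
    # Floor det(L) so near-duplicate points give a very low score, not a math error.
    return math.log(max(det(L), 1e-300)) - math.log(det(L_plus_I))

def coupling_gradient(xs, t, velocity, h=0.5, eps=1e-5):
    """Central-difference gradient of the log-likelihood of the x1 estimates w.r.t. each x_t."""
    def objective(points):
        # x1_hat = x_t + (1 - t) * v(x_t, t), as in the TL;DR.
        x1_hat = [[x + (1.0 - t) * v for x, v in zip(p, velocity(p, t))] for p in points]
        return dpp_log_likelihood(x1_hat, h)
    grad = []
    for i, p in enumerate(xs):
        row = []
        for d in range(len(p)):
            plus = [q[:] for q in xs]
            minus = [q[:] for q in xs]
            plus[i][d] += eps
            minus[i][d] -= eps
            row.append((objective(plus) - objective(minus)) / (2 * eps))
        grad.append(row)
    return grad

def sample(x0, velocity, gamma, steps=20, h=0.5):
    """Euler integration of the coupled ODE from t = 0 to t = 1."""
    xs = [list(p) for p in x0]
    dt = 1.0 / steps
    for n in range(steps):
        t = n * dt
        g = coupling_gradient(xs, t, velocity, h)
        # Step along v plus the direction that raises the DPP log-likelihood.
        xs = [[x + dt * (v + gamma(t) * gi) for x, v, gi in zip(p, velocity(p, t), grow)]
              for p, grow in zip(xs, g)]
    return xs

MU = (1.0, 1.0)  # hypothetical single mode of the toy field

def contract(x, t):
    """Toy stand-in for v_theta: pulls every sample toward MU."""
    return [0.5 * (m - xi) for xi, m in zip(x, MU)]

X0 = [[0.0, 0.0], [0.3, 0.0], [0.1, 0.25]]
iid = sample(X0, contract, gamma=lambda t: 0.0)      # uncoupled baseline
coupled = sample(X0, contract, gamma=lambda t: 0.1)  # DPP-coupled trajectories
print("uncoupled log-likelihood:", dpp_log_likelihood(iid))
print("coupled log-likelihood:  ", dpp_log_likelihood(coupled))
```

With `gamma` set to zero the sampler reduces to plain Euler integration of the toy field, so the same source points can be compared with and without coupling. A practical implementation would presumably differentiate through the model with autodiff and compute the kernel on whatever representation suits the task.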

Abstract

Many real-world applications of flow-based generative models desire a diverse set of samples that cover multiple modes of the target distribution. However, the predominant approach for obtaining diverse sets is not sample-efficient, as it involves independently obtaining many samples from the source distribution and mapping them through the flow until the desired mode coverage is achieved. As an alternative to repeated sampling, we introduce DiverseFlow: a training-free approach to improve the diversity of flow models. Our key idea is to employ a determinantal point process to induce a coupling between the samples that drives diversity under a fixed sampling budget. In essence, DiverseFlow allows exploration of more variations in a learned flow model with fewer samples. We demonstrate the efficacy of our method for tasks where sample-efficient diversity is desirable, such as text-guided image generation with polysemous words, inverse problems like large-hole inpainting, and class-conditional image synthesis.
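To make the sample-inefficiency of repeated IID sampling concrete, the following back-of-the-envelope sketch computes two standard quantities under a simplifying assumption that is not stated in the paper: all modes are equally likely and each sample lands in exactly one mode. The function names are hypothetical, and the values $N = 10$ and $K = 5$ are borrowed from the Figure 2 setting purely for illustration.

```python
def expected_modes_found(n_modes, k_samples):
    """Expected number of distinct modes hit by k IID samples (equiprobable modes assumed)."""
    return n_modes * (1.0 - (1.0 - 1.0 / n_modes) ** k_samples)

def expected_samples_to_cover(n_modes):
    """Coupon-collector expectation: N * (1 + 1/2 + ... + 1/N)."""
    return n_modes * sum(1.0 / i for i in range(1, n_modes + 1))

print(expected_modes_found(10, 5))    # about 4.1 of 10 modes
print(expected_samples_to_cover(10))  # about 29.3 samples
```

Under this assumption, five IID samples cover about four distinct modes on average, and covering all ten takes roughly thirty samples in expectation. Unequal mode weights would make full coverage slower still.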

Paper Structure

This paper contains 30 sections, 12 equations, 20 figures, 2 tables, 2 algorithms.

Figures (20)

  • Figure 1: Text-guided image generation with polysemous words. With the same budget of samples, DiverseFlow (right) finds a more diverse set of results compared to IID sampling (left).
  • Figure 2: Finding $K=5$ samples from the target distribution with $N=10$ modes. In the example, only 3 modes are found by 5 IID samples (a, b). DiverseFlow discovers 5 modes with the same sampling budget in (d, e).
  • Figure 3: For each prompt, the left image (red box) denotes standard IID sampling with classifier-free guidance, while the right image (blue box) shows the result after incorporating DiverseFlow. DiverseFlow finds more diverse sets given the same source points, clearly distinguishable as new semantic meanings in the case of prompts with multiple meanings.
  • Figure 4: Comparing different FM formulations in terms of the number of modes spanned by IID sampling versus with DiverseFlow. More details about the experiment are provided in the supplementary.
  • Figure 5: Inpainting on CelebAHQ-$256\times256$. (a) Dashed boxes show the masked input (top) and the ground truth (bottom), respectively. (b) RectifiedFlow [liu2022flow] + MCG [chung2022improving]. (c) With DiverseFlow.
  • ...and 15 more figures