Flow Generator Matching

Zemin Huang, Zhengyang Geng, Weijian Luo, Guo-jun Qi

TL;DR

This work introduces Flow Generator Matching (FGM), a principled approach for distilling flow-matching (FM) models into one-step generators, which drastically speeds up sampling while preserving fidelity. FGM formulates an equivalent, tractable training objective that aligns the implicit flow of the generator with the flow of a pre-trained FM model, and proves the flow-product identities needed to justify it, which gives theoretical guarantees for one-step performance. Empirically, FGM yields state-of-the-art one-step results on CIFAR10 and strong GenEval performance for one-step text-to-image generation distilled from MM-DiT-SD3, illustrating gains in both sample quality and efficiency. The method avoids explicit density modeling, which distinguishes it from diffusion distillation, and it outperforms existing flow-distillation techniques such as CFM on several benchmarks. FGM does rely on an auxiliary online flow model and is trained data-free, which suggests future work on integrating data and reducing memory requirements.

Abstract

In the realm of Artificial Intelligence Generated Content (AIGC), flow-matching models have emerged as a powerhouse, achieving success due to their robust theoretical underpinnings and solid ability for large-scale generative modeling. These models have demonstrated state-of-the-art performance, but their brilliance comes at a cost. Sampling from these models is notoriously demanding on computational resources, as it requires the multi-step numerical solution of ordinary differential equations (ODEs). Against this backdrop, this paper presents a novel solution with theoretical guarantees in the form of Flow Generator Matching (FGM), an approach designed to accelerate the sampling of flow-matching models to a single generation step while maintaining the original performance. On the CIFAR10 unconditional generation benchmark, our one-step FGM model achieves a new record Fréchet Inception Distance (FID) score of 3.08 among few-step flow-matching-based models, outperforming the original 50-step flow-matching models. Furthermore, we use FGM to distill Stable Diffusion 3, a leading text-to-image flow-matching model based on the MM-DiT architecture. The resulting MM-DiT-FGM one-step text-to-image model demonstrates outstanding industry-level performance. When evaluated on the GenEval benchmark, MM-DiT-FGM delivers remarkable generation quality, rivaling multi-step models while requiring only a single generation step.
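The sampling cost described in the abstract can be illustrated with a toy example. The sketch below is not the FGM training procedure. It only contrasts multi-step Euler integration of a flow's velocity field with a single generator call, on a one-dimensional flow whose endpoint is known in closed form. The curved path, the constants `MU` and `SIGMA`, and the function names `velocity`, `euler_sample`, and `one_step_generator` are all assumptions made for illustration.

```python
import math
import random

# Toy target N(MU, SIGMA^2); both constants are assumed for illustration only.
MU, SIGMA = 3.0, 2.0


def velocity(x, t):
    """Velocity field of the toy path x_t = cos(a) z + sin(a) (MU + SIGMA z), a = pi t / 2."""
    a = 0.5 * math.pi * t
    # Recover the noise z that generated x at time t (the path is affine in z).
    z = (x - MU * math.sin(a)) / (math.cos(a) + SIGMA * math.sin(a))
    # Time derivative of the path, written as a function of the current state x.
    return 0.5 * math.pi * (z * (SIGMA * math.cos(a) - math.sin(a)) + MU * math.cos(a))


def euler_sample(z, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1; returns (sample, function evaluations)."""
    x, dt = z, 1.0 / num_steps
    for k in range(num_steps):
        x = x + dt * velocity(x, k * dt)
    return x, num_steps


def one_step_generator(z):
    """Closed-form endpoint of the toy flow, standing in for a distilled one-step generator."""
    return MU + SIGMA * z, 1


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(1000)]


def max_error(num_steps):
    """Largest gap between the Euler sample and the exact endpoint over the noise batch."""
    return max(abs(euler_sample(z, num_steps)[0] - one_step_generator(z)[0]) for z in noise)


err5, err50 = max_error(5), max_error(50)
print(f"5 Euler steps:  max error {err5:.4f}")
print(f"50 Euler steps: max error {err50:.4f}")
print("one-step generator: 1 function evaluation, exact endpoint in this toy case")
```

In this toy setting the endpoint map is known analytically, so the one-step generator is exact by construction. For a real flow-matching model the map is unknown, and the generator has to be trained to match the teacher's flow, which is the problem FGM addresses.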

Paper Structure

This paper contains 42 sections, 2 theorems, 30 equations, 5 figures, 4 tables, 1 algorithm.

Key Result

Theorem 4.1

Let $\mathbf{f}(\cdot,\theta)$ be a vector-valued function. Using the notation of the setup section (sec:setup), under mild conditions, the following identity holds:

Figures (5)

  • Figure 1: Qualitative evaluation of one-step samples from MM-DiT-FGM. Prompts used in this figure can be found in the appendix listing the evaluation prompts.
  • Figure 2: Visual comparison between our MM-DiT-FGM and other methods. From left to right, the first column is the 28-step SD3 model [esser2024scaling], the second column is the 4-step Hyper-SD3 model [ren2024hyper], and the third column is the 4-step Flash-SD3 model [chadebec2024flash]. The prompts for these images are provided in the appendix listing the evaluation prompts.
  • Figure 3: Visualizations of generated samples from FGM 1-step models and 50-step teacher flow models on the CIFAR10 dataset. On both conditional and unconditional generation, the FGM 1-step models outperform the 50-step teacher flow models.
  • Figure 4: Unconditional samples from the 1-step FGM model on CIFAR10.
  • Figure 5: Conditional samples from the 1-step FGM model on CIFAR10.

Theorems & Definitions (2)

  • Theorem 4.1: Flow Product Identity
  • Theorem 4.2