Implicit Dynamical Flow Fusion (IDFF) for Generative Modeling

Mohammad R. Rezaei, Milos R. Popovic, Milad Lankarany, Rahul G. Krishnan

TL;DR

IDFF presents a momentum-augmented, higher-order dynamical flow fusion framework that upgrades conditional flow matching by injecting a learnable momentum term directly into the sampling vector field and by learning in the data sample space. This design eliminates the need for computationally expensive OT steps, enables larger integration steps, and achieves substantial reductions in the number of function evaluations while preserving sample fidelity across both image and time-series domains. The approach extends to higher-order momentum and supports time-series as well as static data, with a training objective that jointly learns the denoised target and multiple orders of the log-density derivatives. Empirically, IDFF attains competitive or superior performance to CFMs and diffusion-based methods on CIFAR-10, CelebA, MD simulations, and SST forecasting, often with NFEs as low as 5 and significantly faster sampling, highlighting its practical impact for fast, flexible generative modeling. Limitations include increased computational cost for higher-order derivatives and reliance on backbone architectures; future work aims to optimize higher-order terms and explore broader applications such as audio and biological time-series modeling.

Abstract

Conditional Flow Matching (CFM) models can generate high-quality samples from a non-informative prior, but sampling can be slow, often requiring hundreds of network function evaluations (NFEs). To address this, we propose Implicit Dynamical Flow Fusion (IDFF), which learns a new vector field with an additional momentum term that enables longer steps during sample generation while maintaining the fidelity of the generated distribution. Consequently, IDFF reduces the NFEs by a factor of ten (relative to CFMs) without sacrificing sample quality, enabling rapid sampling and efficient handling of image and time-series data generation tasks. We evaluate IDFF on standard image-generation benchmarks such as CIFAR-10 and CelebA, where it achieves likelihood and quality performance comparable to CFMs and diffusion-based models with fewer NFEs. IDFF also shows superior performance on time-series modeling tasks, including molecular simulation and sea surface temperature (SST) datasets, highlighting its versatility and effectiveness across different domains. GitHub repository: https://github.com/MrRezaeiUofT/IDFF
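
To make the NFE count concrete, here is a minimal sketch of an Euler sampler that calls a sample-space predictor once per step and rebuilds the vector field from its output. It assumes a linear interpolation path $\mathbf{x}_t = (1-t)\mathbf{x}_0 + t\,\mathbf{x}_1$, for which $\mathbf{v}_t = (\hat{\mathbf{x}}_1 - \mathbf{x}_t)/(1-t)$. The names `euler_sample` and `x1_hat` are hypothetical, the predictor is a stand-in for a trained network, and the momentum term is deliberately left out.

```python
import numpy as np

def euler_sample(x1_hat, x0, n_steps):
    """Euler-integrate dx/dt = (x1_hat(x, t) - x) / (1 - t) from t=0 to t=1.

    x1_hat is a hypothetical sample-space predictor (a stand-in for a
    trained network). Returns the final sample and the number of
    function evaluations (NFE).
    """
    x = np.asarray(x0, dtype=float).copy()
    dt = 1.0 / n_steps
    nfe = 0
    for i in range(n_steps):
        t = i * dt                   # t stays below 1, so 1 - t > 0
        pred = x1_hat(x, t)          # one network call per step
        nfe += 1
        v = (pred - x) / (1.0 - t)   # vector field rebuilt from the prediction
        x = x + dt * v
    return x, nfe

# Toy predictor that always returns the same target point.
target = np.array([1.0, -2.0])
sample, nfe = euler_sample(lambda x, t: target, np.zeros(2), n_steps=5)
```

With this toy predictor the sampler lands on `target` with `nfe` equal to `n_steps`. With a learned predictor, the step count trades sampling cost against discretization error.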

Paper Structure

This paper contains 51 sections, 1 theorem, 73 equations, 15 figures, 10 tables, 3 algorithms.

Key Result

Lemma 1

(Probability Flow ODE for IDFF) Let $\mathbf{v}_t(\mathbf{x}_t)$ be a vector field that generates $p_t(\mathbf{x}_t)$, and let $\mathbf{w}_t(\mathbf{x}_t)$ be the IDFF vector field defined in Equation \ref{eq:prob_ode_idff}. If $\sigma_t \to 0$ and $\{\gamma_t^k\}_{k=1}^{K} \to 0$ as $t \to 0$ and as $t \to 1$, then the generative process defined by $\mathbf{w}_t(\mathbf{x}_t)$ follows the same marginal distribution as that of the original CFMs governed by Equation \ref{eq:ode-fm}.
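
As a rough numerical illustration of the kind of statement Lemma 1 makes, the sketch below uses a 1-D Gaussian toy problem where the marginal $p_t$, its vector field, and its score are known in closed form. It is not the paper's $\mathbf{w}_t$: it assumes a generic first-order correction $\gamma_t \nabla \log p_t$ paired with noise of scale $\sqrt{2\gamma_t}$, a linear interpolation path, and a schedule $\gamma_t = \gamma\, t(1-t)$ that vanishes at both endpoints. All names (`sample_toy`, `gamma`, `m`, `s`) are hypothetical.

```python
import numpy as np

def sample_toy(n_samples=20000, n_steps=50, m=2.0, s=0.5, gamma=1.0, seed=0):
    """Euler-Maruyama sampling from N(0, 1) at t=0 toward N(m, s^2) at t=1.

    Assumed path: x_t = (1 - t) x_0 + t x_1 with independent x_0, x_1,
    so p_t = N(t * m, (1 - t)^2 + (t * s)^2).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_samples)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        mu = t * m
        var = (1.0 - t) ** 2 + (t * s) ** 2
        # Closed-form vector field that transports N(mu_t, var_t) along t.
        v = m + ((t * s * s - (1.0 - t)) / var) * (x - mu)
        # Closed-form score of the Gaussian marginal p_t.
        score = -(x - mu) / var
        # Assumed schedule: zero at t=0 and t=1.
        g = gamma * t * (1.0 - t)
        # Drift v + g * score with noise sqrt(2 g dt) leaves p_t unchanged
        # in the continuous-time limit (Fokker-Planck argument).
        noise = rng.standard_normal(n_samples)
        x = x + (v + g * score) * dt + np.sqrt(2.0 * g * dt) * noise
    return x

x_plain = sample_toy(gamma=0.0)   # plain probability-flow ODE
x_aug = sample_toy(gamma=1.0)     # score-augmented dynamics
```

Under these assumptions, both runs should end with a sample mean close to `m` and a standard deviation close to `s`, up to Euler discretization and Monte Carlo error.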

Figures (15)

  • Figure 1: Image generation using IDFF across datasets. Additional samples and analysis are provided in Appendix \ref{app:aditional-result-image}.
  • Figure 2: Comparison of trajectory sampling between OT-CFM and IDFF with different orders at NFE=2: The figure displays 4096 final samples generated by each model, with KDE contours shown in blue and ground truth samples represented by black contours. Twelve individual trajectories are overlaid to illustrate the sampling paths. Among the models, the 3rd-order IDFF produces the closest distribution to the true distribution based on the Maximum Mean Discrepancy (MMD) metric.
  • Figure 3: Evaluation of NFE across Iterations (A) and Hyperparameters $\gamma^1_t =\gamma^1 \sigma^2_t$, $\gamma^0_t = \gamma^2 \sigma^2_t$ (B).
  • Figure 4: (A) True and (B) generated dihedral angles. (C) The dihedral angles for an alanine molecule. (D) True and generated dihedral angle trajectories using IDFF-1st order.
  • Figure 5: (A) Comparison of trajectory sampling between 1st-order IDFF and OT-CFMs: the figure displays 4096 final samples generated by IDFF. As shown, IDFF takes larger steps toward the target distribution, guided by the momentum term. (B) OT-CFM sampling process. (C) IDFF sampling process. In this process, $\hat{\mathbf{x}}_1(\cdot)$ approximates the data sample $\mathbf{x}_1$, $\hat{\epsilon}(\cdot)$ approximates the scores associated with $\boldsymbol{\xi}_t$, and $\mathbf{w}_t(\cdot)$ is the vector field calculated by Equation \ref{eq:prob_ode_idff}. The key difference between IDFF and OT-CFMs is how the vector field is generated: 1st-order IDFF generates $\hat{\mathbf{x}}_1(\cdot)$ and $\hat{\epsilon}(\cdot)$ in sample space, reconstructs the vector field from them, and uses momentum to guide the sampling process.
  • ...and 10 more figures

Theorems & Definitions (1)

  • Lemma 1