Training Flow Matching: The Role of Weighting and Parameterization

Anne Gagneux, Ségolène Martin, Rémi Gribonval, Mathurin Massias

TL;DR

The goal of this systematic numerical study is to disentangle the various factors that matter when training a flow matching model, in order to provide practical insights on design choices.

Abstract

We study the training objectives of denoising-based generative models, with a particular focus on loss weighting and output parameterization, including noise-, clean image-, and velocity-based formulations. Through a systematic numerical study, we analyze how these training choices interact with the intrinsic dimensionality of the data manifold, model architecture, and dataset size. Our experiments span synthetic datasets with controlled geometry as well as image data, and compare training objectives using quantitative metrics for denoising accuracy (PSNR across noise levels) and generative quality (FID). Rather than proposing a new method, our goal is to disentangle the various factors that matter when training a flow matching model, in order to provide practical insights on design choices.
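To make the three output parameterizations concrete, here is a minimal sketch of how noise, clean-sample, and velocity predictions can be converted into one another. It assumes the common linear interpolation path $x_t = (1-t)\,x_0 + t\,x_1$, with $x_0$ the noise sample and $x_1$ the clean sample; this convention and all function names are illustrative assumptions and are not taken from the paper, which may use a different time direction or notation.

```python
# Sketch under an ASSUMED convention (not confirmed by the paper):
#   x_t = (1 - t) * x0 + t * x1, with x0 = noise, x1 = clean sample,
#   so the target velocity is v = x1 - x0.
# The functions use only arithmetic, so they accept floats or
# elementwise array types such as NumPy arrays.


def interpolate(x0, x1, t):
    """Point at time t on the straight path from noise x0 to data x1."""
    return (1.0 - t) * x0 + t * x1


def velocity_from_clean(x_t, x1_hat, t):
    """Velocity implied by a clean-sample prediction x1_hat (requires t < 1)."""
    return (x1_hat - x_t) / (1.0 - t)


def velocity_from_noise(x_t, x0_hat, t):
    """Velocity implied by a noise prediction x0_hat (requires t > 0)."""
    return (x_t - x0_hat) / t


def clean_from_velocity(x_t, v_hat, t):
    """Clean-sample estimate implied by a velocity prediction v_hat."""
    return x_t + (1.0 - t) * v_hat


# Toy scalar example (hypothetical values, for illustration only).
x0, x1, t = -1.0, 2.0, 0.25
x_t = interpolate(x0, x1, t)
v_true = x1 - x0

# An imperfect clean-sample prediction and the velocity it implies.
x1_hat = 1.5
v_hat = velocity_from_clean(x_t, x1_hat, t)

# Squared errors in the two parameterizations differ by a factor 1 / (1 - t)^2.
clean_sq_err = (x1 - x1_hat) ** 2
vel_sq_err = (v_true - v_hat) ** 2
```

Under this assumed path, the same prediction error is scaled by $1/(1-t)^2$ when measured on the velocity rather than on the clean sample, which is one way to see why the choice of parameterization and the choice of loss weighting are intertwined.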

Paper Structure

This paper contains 29 sections, 7 equations, 8 figures, and 5 tables.

Figures (8)

  • Figure 1: PSNR and FID for the different losses on CIFAR-10. Models that reach the highest PSNR (a small PSNR difference relative to standard FM, $w^t_\mathrm{vel}$) also reach the lowest FID.
  • Figure 2: PSNR and FID for the different parameterizations on CIFAR-10. Models that reach the highest PSNR (a small PSNR difference relative to standard FM) also reach the lowest FID.
  • Figure 3: Denoising and generation performance of $\mathcal{C}_{\mathrm{vel}}$ versus $\mathcal{C}_{\mathrm{den}}$ on CIFAR-10 (dimension $3 \times 32^2$) when varying the patch size in the ViT architecture. In the notation ViT/$p$, $p$ denotes a patch size of $p \times p$. Green indicates that $\mathcal{C}_{\mathrm{den}}$ performs better than $\mathcal{C}_{\mathrm{vel}}$; red indicates the opposite.
  • Figure 4: Nine samples from the Fourier-32 dataset with controlled manifold dimension $m$.
  • Figure 5: PSNR gap on the Fourier-32 dataset for intrinsic dimensions $m\in\{4,8,16\}$, across four architectures. Green indicates that $\mathcal{C}_{\mathrm{den}}$ performs better than $\mathcal{C}_{\mathrm{vel}}$; red indicates the opposite.
  • ...and 3 more figures