Is There a Better Source Distribution than Gaussian? Exploring Source Distributions for Image Flow Matching

Junho Lee, Kwanseok Kim, Joonseok Lee

TL;DR

The paper investigates whether any source distribution can outperform the Gaussian in flow matching. It introduces a 2D simulation that preserves key high-dimensional geometric properties, and uses it to study learning dynamics. The simulation shows that density-based approximations of the data can cause mode discrepancy and that overly concentrated directional sources induce path entanglement, while Gaussian sources provide robust omnidirectional supervision. Building on these insights, the paper proposes a practical hybrid method: Norm Alignment during training and Pruned Sampling at inference. The pruning step can be applied post hoc to models already trained with a Gaussian source, without retraining. The method yields consistent improvements in generation quality and sampling efficiency on CIFAR-10, ImageNet64, and related tasks. The work offers guidelines for designing source distributions in flow matching and provides code for replication.

Abstract

Flow matching has emerged as a powerful generative modeling approach with flexible choices of source distribution. While Gaussian distributions are commonly used, the potential for better alternatives in high-dimensional data generation remains largely unexplored. In this paper, we propose a novel 2D simulation that captures high-dimensional geometric properties in an interpretable 2D setting, enabling us to analyze the learning dynamics of flow matching during training. Based on this analysis, we derive several key insights about flow matching behavior: (1) density approximation can paradoxically degrade performance due to mode discrepancy, (2) directional alignment suffers from path entanglement when overly concentrated, (3) Gaussian's omnidirectional coverage ensures robust learning, and (4) norm misalignment incurs substantial learning costs. Building on these insights, we propose a practical framework that combines norm-aligned training with directionally-pruned sampling. This approach maintains the robust omnidirectional supervision essential for stable flow learning, while eliminating initializations in data-sparse regions during inference. Importantly, our pruning strategy can be applied to any flow matching model trained with a Gaussian source, providing immediate performance gains without the need for retraining. Empirical evaluations demonstrate consistent improvements in both generation quality and sampling efficiency. Our findings provide practical insights and guidelines for source distribution design and introduce a readily applicable technique for improving existing flow matching models. Our code is available at https://github.com/kwanseokk/SourceFM.
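
The two components named in the abstract can be illustrated with a rough NumPy sketch. This is one possible reading of the ideas and is not taken from the authors' repository. The function names (`norm_align`, `pruned_gaussian_sample`), the single global rescaling to a mean norm of about sqrt(d), the cosine-similarity acceptance rule, and the `threshold` parameter are all assumptions made for illustration.

```python
import numpy as np


def norm_align(data):
    """Rescale data so its mean L2 norm is sqrt(d), roughly the typical
    norm of a standard Gaussian sample in d dimensions.

    Assumption: one global scale factor; the paper's scheme may differ.
    """
    d = data.shape[1]
    mean_norm = np.linalg.norm(data, axis=1).mean()
    return data * (np.sqrt(d) / mean_norm)


def pruned_gaussian_sample(n, data, threshold, rng, batch=1024, max_batches=1000):
    """Draw n samples from N(0, I), rejecting draws that point away from the data.

    Assumption: a draw is kept when its cosine similarity to at least one
    data direction is >= threshold (a hypothetical pruning criterion).
    """
    d = data.shape[1]
    data_dirs = data / np.linalg.norm(data, axis=1, keepdims=True)
    kept, count = [], 0
    for _ in range(max_batches):
        z = rng.standard_normal((batch, d))
        z_dirs = z / np.linalg.norm(z, axis=1, keepdims=True)
        # Highest cosine similarity between each draw and any data direction.
        max_cos = (z_dirs @ data_dirs.T).max(axis=1)
        accepted = z[max_cos >= threshold]
        kept.append(accepted)
        count += len(accepted)
        if count >= n:
            return np.concatenate(kept)[:n]
    raise RuntimeError("threshold too strict: not enough samples accepted")
```

The accepted draws would then serve as initial points for the ODE solver of a model trained with the ordinary Gaussian source, which is why this kind of pruning needs no retraining. In genuinely high-dimensional settings, cosine similarities between random directions concentrate near zero, so a usable threshold would have to be chosen with that in mind.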

Paper Structure

This paper contains 41 sections, 19 equations, 11 figures, 9 tables.

Figures (11)

  • Figure 1: 2D simulations of flow matching. The first two are examples of naive low-dimensional flow matching illustrations of (a) Gaussian to Gaussian and (b) 8 Gaussians to moons. Our proposed 2D simulation for flow matching is demonstrated in (c). Dots represent source (black), target data (blue), and generated samples (red). Blue dashed lines show successful ODE trajectories from source to generated samples, while red dashed lines indicate failed trajectories.
  • Figure 2: Visualization of flow matching with density-approximated source. (a) OT-CFM with Approximated Source (iter 200), (b) OT-CFM with Approximated Source (iter 6000), (c) OT-CFM with Approximated Source (iter 10000). "Norm. W" denotes normalized Wasserstein, where a lower value indicates better generation performance.
  • Figure 3: Visualization of flow matching with ideally aligned directional source. (a) OT-CFM with directional source, (b) OT-CFM with tight directional source, (c) global OT pairing with directional source. "Norm. W" denotes normalized Wasserstein.
  • Figure 4: Visualization of flow matching with different methods. (a) I-CFM with Gaussian source, (b) OT-CFM with Gaussian source, (c) I-CFM with pruned source.
  • Figure 5: Visualization of flow trajectory heatmap. These heatmaps show the distribution of paths learned by each model. Color intensity corresponds to trajectory density, where brighter regions indicate higher concentrations of paths. (a) I-CFM and (b) OT-CFM.
  • ...and 6 more figures