Beyond Optimal Transport: Model-Aligned Coupling for Flow Matching

Yexiong Lin, Yu Yao, Tongliang Liu

TL;DR

This paper addresses an inefficiency of Flow Matching (FM): random couplings produce crossing, curved transport trajectories that need many integration steps. It introduces Model-Aligned Coupling (MAC), which selects training couplings according to the model's current prediction error so that supervision aligns with transport directions the model can actually learn. MAC is implemented via a top-$k$ selection strategy that adapts during training. It can be applied as a regularizer across FM variants, including Shortcut Models, and can be extended to full coupling optimization (MAC-full) using Sinkhorn-type assignment. Experiments on MNIST, CIFAR-10, and CelebA-HQ-256 show that MAC substantially improves one-step and few-step generation quality, reduces the number of integration steps required, and is robust to its hyperparameters. Overall, the results indicate that coupling strategies informed by model capacity can significantly improve the efficiency and fidelity of flow-based generative models.

Abstract

Flow Matching (FM) is an effective framework for training a model to learn a vector field that transports samples from a source distribution to a target distribution. To train the model, early FM methods use random couplings, which often result in crossing paths and lead the model to learn non-straight trajectories that require many integration steps to generate high-quality samples. To address this, recent methods adopt Optimal Transport (OT) to construct couplings by minimizing geometric distances, which helps reduce path crossings. However, we observe that such geometry-based couplings do not necessarily align with the model's preferred trajectories. This makes the vector field induced by these couplings difficult to learn, which in turn prevents the model from learning straight trajectories. Motivated by this, we propose Model-Aligned Coupling (MAC), an effective method that chooses training couplings based not only on geometric distance but also on how well they align with the model's preferred transport directions, as measured by its prediction error. To avoid a costly matching process, MAC selects the top-$k$ fraction of couplings with the lowest error for training. Extensive experiments show that MAC significantly improves generation quality and efficiency in few-step settings compared to existing methods. Project page: https://yexionglin.github.io/mac
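
The top-$k$ selection described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the common linear path $x_t = (1-t)x_0 + t x_1$ with target velocity $x_1 - x_0$, which this page does not state explicitly, and the names `select_top_k_couplings` and `toy_model` are hypothetical.

```python
import numpy as np

def select_top_k_couplings(model, x0, x1, t, k=0.5):
    """Return indices of the fraction k of (x0, x1) pairs with the lowest error.

    model: hypothetical callable (x_t, t) -> predicted velocity, shape like x_t.
    """
    t_col = t.reshape(-1, 1)
    x_t = (1.0 - t_col) * x0 + t_col * x1   # assumed linear interpolation path
    target = x1 - x0                        # velocity of that straight path
    # Per-pair squared prediction error of the current model.
    err = np.sum((model(x_t, t) - target) ** 2, axis=1)
    n_keep = max(1, int(np.ceil(k * len(err))))
    keep = np.argsort(err)[:n_keep]         # lowest-error pairs first
    return keep, err

# Toy usage with a stand-in "model" that always predicts velocity (3, 3).
rng = np.random.default_rng(0)
x0 = rng.normal(size=(8, 2))
x1 = rng.normal(size=(8, 2)) + 3.0
t = rng.uniform(size=8)
toy_model = lambda x_t, t: np.full_like(x_t, 3.0)  # hypothetical placeholder
keep, err = select_top_k_couplings(toy_model, x0, x1, t, k=0.25)
```

In a training loop, only the pairs indexed by `keep` would contribute to the loss. Because the error is recomputed from the current model at each step, the selected set changes as training progresses.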

Paper Structure

This paper contains 33 sections, 13 equations, 6 figures, 4 tables, 1 algorithm.

Figures (6)

  • Figure 1: SubFig. (a–c) illustrate coupling strategies between a source distribution and a target distribution, defined respectively as a Gaussian mixture with four components and a Gaussian mixture with two components. Both follow the form $p(x) = \sum_{i=1}^K \pi_i \mathcal{N}(x \mid \mu_i, I)$. We compare Random Coupling, Optimal Transport (OT), and our proposed Model-Aligned Coupling (MAC). While OT improves over Random by reducing path crossings, it may still induce local ambiguity (e.g., multiple directions at $t=0$). In contrast, MAC selects couplings that better align with the model's learned vector field. SubFig. (c-d) show one-step generated samples for different models. MAC yields significantly better sample quality with fewer integration steps compared to standard FM, OT-FM, and Shortcut models.
  • Figure 2: Sensitivity analysis on MNIST for hyperparameters $k$, $r$, and $\lambda$.
  • Figure 3: Qualitative comparison on CelebA-HQ-256 under the four-step generation setting. Compared to Shortcut Models, MAC generates images with more realistic facial details, smoother textures, and better global structure.
  • Figure 4: Qualitative comparison of generated MNIST samples across different models under varying sampling steps (1, 4, and 128 steps). MAC-full generates sharper digits, especially in the one-step and four-step settings.
  • Figure 5: Qualitative comparison of generated CIFAR-10 samples across different models under varying sampling steps (1, 4, and 128 steps). MAC-full generates more realistic and structurally consistent objects, especially in the one-step and four-step settings.
  • ...and 1 more figure
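
The toy setup in Figure 1 draws source and target samples from mixtures of the form $p(x) = \sum_{i=1}^K \pi_i \mathcal{N}(x \mid \mu_i, I)$. The sketch below shows one way to sample from such a mixture. The component means and the uniform weights are illustrative assumptions, not values from the paper, and `sample_gmm` is a hypothetical helper name.

```python
import numpy as np

def sample_gmm(means, weights, n, rng):
    """Draw n samples from sum_i weights[i] * N(means[i], I)."""
    means = np.asarray(means, dtype=float)
    # Pick a component per sample, then add identity-covariance Gaussian noise.
    comp = rng.choice(len(means), size=n, p=weights)
    return means[comp] + rng.normal(size=(n, means.shape[1]))

rng = np.random.default_rng(0)
# Assumed means: four source components and two target components.
source_means = [[-4, -4], [-4, 4], [4, -4], [4, 4]]
target_means = [[0, -6], [0, 6]]
x0 = sample_gmm(source_means, [0.25] * 4, 1000, rng)  # source samples
x1 = sample_gmm(target_means, [0.5] * 2, 1000, rng)   # target samples
```

Pairing rows of `x0` and `x1` by index corresponds to a random coupling, the baseline that OT and MAC are compared against in the figure.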