COT-FM: Cluster-wise Optimal Transport Flow Matching

Chiensheng Chiang, Kuan-Hsun Tu, Jia-Wei Liao, Cheng-Fu Chou, Tsung-Wei Ke

Abstract

We introduce COT-FM, a general framework that reshapes the probability path in Flow Matching (FM) to achieve faster and more reliable generation. FM models often produce curved trajectories due to random or batchwise couplings, which increase discretization error and reduce sample quality. COT-FM fixes this by clustering target samples and assigning each cluster a dedicated source distribution obtained by reversing pretrained FM models. This divide-and-conquer strategy yields more accurate local transport and significantly straighter vector fields, all without changing the model architecture. As a plug-and-play approach, COT-FM consistently accelerates sampling and improves generation quality across 2D datasets, image generation benchmarks, and robotic manipulation tasks.
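The link between coupling choice and trajectory straightness can be illustrated with a small 1D toy. This is only a sketch, not the paper's procedure: it assumes scalar samples, a quadratic cost (for which the optimal coupling in 1D is obtained by sorting both sides), and straight-line interpolants $x_t = (1-t)x_0 + t x_1$. The helper names `count_crossings` and `transport_cost` are hypothetical and introduced here for illustration.

```python
import random


def count_crossings(pairs):
    """Count pairs of straight paths x_t = (1-t)*x0 + t*x1 that meet for some t in (0, 1)."""
    crossings = 0
    for i in range(len(pairs)):
        a0, a1 = pairs[i]
        for j in range(i + 1, len(pairs)):
            b0, b1 = pairs[j]
            # The gap between two linear paths is linear in t, so the paths
            # cross iff the gap changes sign between t=0 and t=1.
            if (a0 - b0) * (a1 - b1) < 0:
                crossings += 1
    return crossings


def transport_cost(pairs):
    """Mean squared displacement of a coupling."""
    return sum((x1 - x0) ** 2 for x0, x1 in pairs) / len(pairs)


rng = random.Random(0)
n = 64
source = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Toy bimodal target (an assumption for this demo, not a dataset from the paper).
target = [rng.gauss(rng.choice([-3.0, 3.0]), 0.3) for _ in range(n)]

# Independent draws paired by index: a random coupling.
random_pairs = list(zip(source, target))
# Sorting both sides gives the optimal coupling for quadratic cost in 1D.
ot_pairs = list(zip(sorted(source), sorted(target)))

print("random coupling:", count_crossings(random_pairs), "crossings, cost", transport_cost(random_pairs))
print("OT coupling:    ", count_crossings(ot_pairs), "crossings, cost", transport_cost(ot_pairs))
```

Where paths cross, a model regressing a single velocity at that point has to average conflicting targets, which is the source of curvature described above. The sorted coupling has no crossings by construction, and its cost is never higher than that of the random coupling.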

Paper Structure (42 sections, 13 equations, 11 figures, 13 tables, 6 algorithms)

Figures (11)

  • Figure 1: COT-FM Yields Straight, Structure-Preserving Transport Flows. Blue points show the source distribution (Gaussian), and gray points show the target distributions (a 5-component Gaussian mixture and Two Moons). Orange points are generated samples, and purple lines denote the learned transport trajectories. Red crosses mark cluster means under our Cluster-wise Optimal Transport Flow Matching (COT-FM). The method yields straight trajectories while still capturing the structure of each target distribution.
  • Figure 2: Vector field results from different coupling strategies. Random coupling forces the model to regress inconsistent (gray) velocity targets, creating ambiguous intersections and pushing the model toward an averaged (purple) direction, which produces curved velocity fields. In contrast, optimal transport provides consistent couplings with fewer intersections, enabling the model to learn much straighter velocity fields.
  • Figure 3: Overview of our proposed method. In Stage 1, we cluster the dataset, and for each image within a cluster $\mathcal{C}_k$, we reverse the ODE back to the noise space using a pretrained flow model to compute the new mean $\bm{\mu}_{0,k}$ and covariance $\bm{\Sigma}_{0,k}$ for that cluster. In Stage 2, we apply optimal transport to finetune the pretrained flow model across all clusters, encouraging it to align with the distributional structure captured in Stage 1. During inference, we first sample a cluster index $k$, then draw a noise sample $\bm{x}$ from $\mathcal{N}(\bm{\mu}_{0,k}, \bm{\Sigma}_{0,k})$, and pass it through the finetuned flow model to generate an image.
  • Figure 4: Visualization of 2D checkerboard data under different methods. Blue points denote the source distribution, gray points the target distribution, and orange points the generated samples.
  • Figure 4: Robotic manipulation on LIBERO benchmarks (Liu et al., 2023). COT-FM achieves single-step performance comparable to FLOWER's 4-step generation while outperforming other single-step baselines. Performance is measured by success rate (higher is better).
  • ...and 6 more figures
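
The Figure 3 caption describes per-cluster source statistics and a two-step sampling rule at inference. The sketch below illustrates that flow in 2D under several loudly stated assumptions: the "reversed latents" are synthetic stand-ins for what a pretrained model would produce, the cluster index is drawn in proportion to cluster size (the caption does not say how $k$ is sampled), and `toy_velocity` is a placeholder for the finetuned flow model. All function names are hypothetical.

```python
import math
import random


def fit_gaussian(points):
    """Return the mean and 2x2 covariance of a list of 2D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), ((sxx, sxy), (sxy, syy))


def sample_gaussian(mean, cov, rng):
    """Draw one sample from N(mean, cov) via a hand-written 2x2 Cholesky factor."""
    (sxx, sxy), (_, syy) = cov
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(max(syy - l21 ** 2, 0.0))
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2)


def sample_source(stats, sizes, rng):
    """Pick a cluster index k (assumed proportional to cluster size), then draw x0."""
    k = rng.choices(range(len(stats)), weights=sizes, k=1)[0]
    mean, cov = stats[k]
    return k, sample_gaussian(mean, cov, rng)


def euler_generate(x0, velocity, steps=1):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed Euler steps."""
    x, dt = x0, 1.0 / steps
    for i in range(steps):
        v = velocity(x, i * dt)
        x = (x[0] + dt * v[0], x[1] + dt * v[1])
    return x


rng = random.Random(0)
# Synthetic stand-in for Stage 1: latents obtained by reversing the ODE, grouped by cluster.
reversed_latents = [
    [(rng.gauss(-2.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(500)],
    [(rng.gauss(2.0, 0.5), rng.gauss(1.0, 0.5)) for _ in range(300)],
]
stats = [fit_gaussian(z) for z in reversed_latents]  # (mu_{0,k}, Sigma_{0,k}) per cluster
sizes = [len(z) for z in reversed_latents]


def toy_velocity(x, t):
    """Placeholder for the finetuned flow model: a constant drift."""
    return (0.0, 1.0)


k, x0 = sample_source(stats, sizes, rng)
x1 = euler_generate(x0, toy_velocity, steps=1)
print("cluster", k, "source", x0, "generated", x1)
```

With a straight (here constant) velocity field, a single Euler step already lands on the same endpoint as a multi-step solve, which is the practical payoff of straighter trajectories.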