CurveFlow: Curvature-Guided Flow Matching for Image Generation

Yan Luo, Drake Du, Hao Huang, Yi Fang, Mengyu Wang

TL;DR

Rectified Flow's zero-curvature linear trajectories can misalign image generation with complex textual prompts. CurveFlow introduces curvature-guided, non-linear trajectories parameterized by $z_t = a_\phi(t) x_0 + b_\psi(t) \epsilon$ and a robust curvature regularization to stabilize learning. Empirical results on MS COCO 2014/2017 show state-of-the-art text-to-image performance with improved semantic metrics (BLEU, METEOR, ROUGE, CLAIR) and strong image quality (FID), confirming that curvature-aware flow enhances instruction compliance. The approach offers an efficient, geometry-aware alternative to diffusion, enabling more faithful and detailed image synthesis guided by complex text.

Abstract

Existing rectified flow models are based on linear trajectories between data and noise distributions. This linearity enforces zero curvature, which can inadvertently force the image generation process through low-probability regions of the data manifold. A key question remains underexplored: how does the curvature of these trajectories correlate with the semantic alignment between generated images and their corresponding captions, i.e., instructional compliance? To address this, we introduce CurveFlow, a novel flow matching framework designed to learn smooth, non-linear trajectories by directly incorporating curvature guidance into the flow path. Our method features a robust curvature regularization technique that penalizes abrupt changes in the trajectory's intrinsic dynamics. Extensive experiments on MS COCO 2014 and 2017 demonstrate that CurveFlow achieves state-of-the-art performance in text-to-image generation, significantly outperforming both standard rectified flow variants and other non-linear baselines like Rectified Diffusion. The improvements are especially evident in semantic consistency metrics such as BLEU, METEOR, ROUGE, and CLAIR. This confirms that our curvature-aware modeling substantially enhances the model's ability to faithfully follow complex instructions while simultaneously maintaining high image quality. The code is made publicly available at https://github.com/Harvard-AI-and-Robotics-Lab/CurveFlow.
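
The claim that linearity enforces zero curvature can be checked in a few lines. The short derivation below differentiates the general path $z_t = a(t) x_0 + b(t) \epsilon$ twice in $t$. The rectified-flow schedule $a(t) = 1 - t$, $b(t) = t$ is the commonly used convention and is an assumption here, not something quoted from this page.

$$
\dot{z}_t = a'(t)\, x_0 + b'(t)\, \epsilon, \qquad \ddot{z}_t = a''(t)\, x_0 + b''(t)\, \epsilon
$$

$$
a(t) = 1 - t,\; b(t) = t \;\Longrightarrow\; \dot{z}_t = \epsilon - x_0, \qquad \ddot{z}_t = 0
$$

The velocity is constant and the acceleration vanishes, so the path is a straight segment with zero curvature. Any schedule with non-zero $a''(t)$ or $b''(t)$ can bend the path, provided the resulting acceleration is not parallel to the velocity.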

Paper Structure

This paper contains 8 sections, 12 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Limitations of linear trajectory hypothesis in rectified flows (RFs). Rectified flow's linear trajectory hypothesis could break down when modeling complex data distributions. Real data typically resides on non-linear manifolds (blue curve), while rectified flows enforce linear paths (red dashed line) that deviate from the optimal transport path (green), leading to generation artifacts. As a result, the images generated by rectified flows may not accurately capture the semantics of the prompts.
  • Figure 2: Schematic view of CurveFlow. The diagram illustrates how CurveFlow establishes curve trajectories between data samples ($x_0$) and Gaussian noise ($\epsilon$), contrasting with the linear paths of Rectified Flow (dashed line). The trajectory is defined by $z_t = a_\phi(t)x_0 + b_\psi(t)\epsilon$, where curvature $\kappa(t)$ measures the deviation from linearity. Key innovations include: (1) a curvature-aware training objective $\mathcal{L}_{\text{Curve-FM}}$ that aligns the velocity field with curve paths, and (2) curvature regularization $\mathcal{L}_{\text{curvature}}$ that controls path geometry. These components together enable CurveFlow to handle non-linear data transitions while retaining simulation-free training advantages.
  • Figure 3: Qualitative comparison showing that CurveFlow generates captions with superior semantic accuracy and detail alignment compared to RF variants.
  • Figure 4: Impact of $\lambda$ (left) and the dimensions of coefficient networks (right) on FID and METEOR.
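
The trajectory form $z_t = a_\phi(t)x_0 + b_\psi(t)\epsilon$ and the curvature $\kappa(t)$ mentioned in the Figure 2 caption can be illustrated with a minimal NumPy sketch. Several things here are assumptions rather than the authors' method: the closed-form schedules `linear_coeffs` and `trig_coeffs` are hypothetical stand-ins for the learned coefficient networks, and `path_curvature` uses the standard differential-geometry curvature of a space curve, which may differ from the paper's exact definition of $\kappa(t)$ or $\mathcal{L}_{\text{curvature}}$.

```python
import numpy as np


def linear_coeffs(t):
    """Rectified-flow style schedule a(t) = 1 - t, b(t) = t (assumed convention).

    Returns (a, b), (a', b'), (a'', b'') evaluated at t.
    """
    return (1.0 - t, t), (-1.0, 1.0), (0.0, 0.0)


def trig_coeffs(t):
    """Hypothetical curved schedule a(t) = cos(pi t / 2), b(t) = sin(pi t / 2).

    A stand-in for learned coefficient networks, chosen only because its
    derivatives are available in closed form.
    """
    w = 0.5 * np.pi
    a, b = np.cos(w * t), np.sin(w * t)
    return (a, b), (-w * b, w * a), (-w**2 * a, -w**2 * b)


def path_point(coeffs, t, x0, eps):
    """Point on the path: z_t = a(t) * x0 + b(t) * eps."""
    (a, b), _, _ = coeffs(t)
    return a * x0 + b * eps


def path_curvature(coeffs, t, x0, eps):
    """Curvature of the path at time t.

    Uses kappa = sqrt(|v|^2 |acc|^2 - (v . acc)^2) / |v|^3, the standard
    formula for a curve in R^n (an assumption, not the paper's definition).
    """
    _, (da, db), (dda, ddb) = coeffs(t)
    vel = da * x0 + db * eps    # first derivative of z_t
    acc = dda * x0 + ddb * eps  # second derivative of z_t
    vv, aa, va = vel @ vel, acc @ acc, vel @ acc
    # Clamp at zero to guard against tiny negative values from rounding.
    return np.sqrt(max(vv * aa - va**2, 0.0)) / vv**1.5


# Toy 2-D example with orthonormal "data" and "noise" vectors.
x0 = np.array([1.0, 0.0])
eps = np.array([0.0, 1.0])
kappa_linear = path_curvature(linear_coeffs, 0.3, x0, eps)
kappa_curved = path_curvature(trig_coeffs, 0.3, x0, eps)
```

In this toy setting the linear schedule gives zero curvature at every $t$, while the trigonometric schedule traces a quarter of the unit circle and so has curvature 1. Both schedules still start at $x_0$ when $t = 0$ and end at $\epsilon$ when $t = 1$.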