FlowConsist: Make Your Flow Consistent with Real Trajectory
Tianyi Zhang, Chengcheng Liu, Jinwei Chen, Chun-Le Guo, Chongyi Li, Ming-Ming Cheng, Bo Li, Peng-Tao Jiang
TL;DR
FlowConsist addresses two core problems in fast-flow generative models: trajectory drift caused by training on conditional velocities, and the accumulation of approximation errors over long time intervals. It replaces conditional velocities with the model’s own predicted marginal velocity, so that training follows a single consistent ODE path, and adds a trajectory rectification mechanism that aligns the marginals of generated samples with those of real data at every time step, using an auxiliary predictor for marginal-velocity alignment. The method reaches a state-of-the-art FID of 1.52 on ImageNet 256×256 with a single function evaluation (1 NFE), improving on prior fast-flow methods and remaining competitive with multi-step diffusion models. The work offers both theoretical analysis and a practical training framework that improves fast-flow models without teacher distillation and is applicable across architectures.
Abstract
Fast flow models accelerate the iterative sampling process by learning to directly predict ODE path integrals, enabling one-step or few-step generation. However, we argue that current fast-flow training paradigms suffer from two fundamental issues. First, conditional velocities constructed from randomly paired noise-data samples introduce systematic trajectory drift, preventing models from following a consistent ODE path. Second, the model's approximation errors accumulate over time steps, leading to severe deviations across long time intervals. To address these issues, we propose FlowConsist, a training framework designed to enforce trajectory consistency in fast flows. To address trajectory drift, FlowConsist replaces conditional velocities with the marginal velocities predicted by the model itself, aligning optimization with the true trajectory. To address error accumulation over time steps, we introduce a trajectory rectification strategy that aligns the marginal distributions of generated and real samples at every time step along the trajectory. Our method establishes a new state-of-the-art on ImageNet 256×256, achieving an FID of 1.52 with only 1 sampling step.
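The distinction between conditional and marginal velocities can be illustrated with a small toy example. The sketch below is not the authors' training code: it assumes a linear interpolation path x_t = (1 - t) x + t ε, a 1-D data distribution uniform on {-1, +1}, and standard Gaussian noise, all chosen only because the marginal velocity then has a closed form. The function names are hypothetical. It shows that the conditional velocity of a randomly paired sample matches the marginal velocity only on average, and deviates from it substantially for any individual pair.

```python
import numpy as np

def conditional_velocity(x, eps):
    """Velocity of the straight path x_t = (1 - t) * x + t * eps for one (data, noise) pair."""
    return eps - x

def marginal_velocity(x_t, t):
    """E[eps - x | x_t] for data uniform on {-1, +1} and eps ~ N(0, 1) (toy assumption).

    Since x_t = (1 - t) * x + t * eps, we have eps - x = (x_t - x) / t, and the
    posterior mean of x given x_t is tanh((1 - t) * x_t / t**2).
    """
    posterior_mean_x = np.tanh((1.0 - t) * x_t / t**2)
    return (x_t - posterior_mean_x) / t

rng = np.random.default_rng(0)
t = 0.7
n = 200_000

# Randomly pair data samples with noise samples, as in standard flow training.
x = rng.choice([-1.0, 1.0], size=n)
eps = rng.standard_normal(n)
x_t = (1.0 - t) * x + t * eps

v_cond = conditional_velocity(x, eps)   # per-pair training target
v_marg = marginal_velocity(x_t, t)      # velocity of the marginal ODE at x_t

residual = v_cond - v_marg
mean_residual = float(residual.mean())  # close to zero: unbiased on average
residual_std = float(residual.std())    # far from zero: large per-sample deviation

print(f"mean residual: {mean_residual:.4f}")
print(f"residual std:  {residual_std:.4f}")
```

In this toy setting, the per-pair targets scatter widely around the single velocity that defines the marginal ODE path, which is the kind of mismatch that motivates replacing conditional velocities in the first place.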
