Towards Hierarchical Rectified Flow
Yichi Zhang, Yici Yan, Alex Schwing, Zhizhen Zhao
TL;DR
Classic rectified flow models only the expected velocity at each location and time, so it cannot represent a multimodal velocity distribution. Hierarchical Rectified Flow (HRF) addresses this limitation by learning an acceleration field in velocity space through hierarchically coupled ODEs, which captures the full velocity distribution. Sampling is a two-stage process: first draw a velocity sample from the learned distribution $\pi_1(v; x_t,t)$ by integrating the acceleration ODE forward, then use that velocity to update the location. The resulting paths can intersect and are straighter, which reduces the number of neural function evaluations (NFEs) required. The authors derive analytical forms of the velocity distribution for Gaussian mixtures, propose an acceleration-matching training objective, and extend the framework to depth $D$ to capture higher-order dynamics. Experiments on synthetic 1D/2D data and image datasets (MNIST, CIFAR-10, ImageNet-32) show a better data fit (lower WD/SWD/FID) at similar NFEs, with HRF2 often outperforming the RF baseline. Code is released for reproducibility.
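For readers who want the summary in symbols, the following is one plausible way to write the depth-2 coupling and the acceleration-matching objective. It is a sketch, not a quotation from the paper: it assumes linear interpolation in both location and velocity, and the symbols $\rho_0$ (location source), $\pi_0$ (velocity source), $a$ (acceleration model), $u_\tau$, $z_t$, $v_0$, and $\tau$ are notation introduced here for illustration.

$$\frac{du_\tau}{d\tau} = a(z_t, t, u_\tau, \tau),\quad u_0 \sim \pi_0, \qquad \frac{dz_t}{dt} = u_1(z_t, t),\quad z_0 \sim \rho_0$$

$$\min_{a}\; \mathbb{E}_{x_0, x_1, t, v_0, \tau}\left[\big\| (x_1 - x_0 - v_0) - a(x_t, t, v_\tau, \tau) \big\|^2\right], \qquad x_t = (1-t)\,x_0 + t\,x_1, \quad v_\tau = (1-\tau)\,v_0 + \tau\,(x_1 - x_0)$$

Under these assumptions the regression target $x_1 - x_0 - v_0$ is the constant acceleration that carries the source velocity $v_0$ to the target velocity $x_1 - x_0$ over $\tau \in [0, 1]$, mirroring how classic rectified flow regresses onto $x_1 - x_0$.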
Abstract
We formulate a hierarchical rectified flow to model data distributions. It hierarchically couples multiple ordinary differential equations (ODEs) and defines a time-differentiable stochastic process that generates a data distribution from a known source distribution. Each ODE resembles the ODE that is solved in a classic rectified flow, but differs in its domain, i.e., location, velocity, acceleration, etc. Unlike the classic rectified flow formulation, which formulates a single ODE in the location domain and only captures the expected velocity field (sufficient to capture a multi-modal data distribution), the hierarchical rectified flow formulation models the multi-modal random velocity field, acceleration field, etc., in their entirety. This more faithful modeling of the random velocity field enables integration paths to intersect when the underlying ODE is solved during data generation. Intersecting paths in turn lead to integration trajectories that are straighter than those obtained in the classic rectified flow formulation, where integration paths cannot intersect. As a result, data distributions can be modeled with fewer neural function evaluations. We empirically verify this on synthetic 1D and 2D data as well as MNIST, CIFAR-10, and ImageNet-32 data. Our code is available at: https://riccizz.github.io/HRF/.
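The coupling of a velocity-domain ODE with a location-domain ODE can be illustrated with a minimal NumPy sketch of nested Euler integration. This is not the authors' released code. The function names (`sample_velocity`, `hrf_sample`, `toy_accel`), the argument order `accel(x, t, v, tau)`, the standard Gaussian velocity source, and the Euler solver are all assumptions made for illustration. In place of a trained network, `toy_accel` uses a closed-form field for a hypothetical target concentrated at a single point, for which the velocity distribution collapses to one value.

```python
import numpy as np

def sample_velocity(accel, x, t, rng, n_tau=8):
    """Inner stage: Euler-integrate dv/dtau = accel(x, t, v, tau) from tau=0 to 1."""
    v = rng.standard_normal(x.shape)  # v_0 drawn from an assumed standard Gaussian source
    d_tau = 1.0 / n_tau
    for k in range(n_tau):
        v = v + d_tau * accel(x, t, v, k * d_tau)
    return v

def hrf_sample(accel, x0, rng, n_t=4, n_tau=8):
    """Outer stage: Euler-integrate dx/dt = v, drawing a fresh velocity sample at each step."""
    x = x0.copy()
    dt = 1.0 / n_t
    for i in range(n_t):
        v = sample_velocity(accel, x, i * dt, rng, n_tau)
        x = x + dt * v
    return x

def toy_accel(x, t, v, tau, target=2.0):
    """Hypothetical stand-in for a trained acceleration model.

    For a target concentrated at the single point `target`, the only velocity
    consistent with (x, t) is v1 = (target - x) / (1 - t), and the straight-line
    acceleration that reaches v1 at tau=1 is (v1 - v) / (1 - tau).
    """
    v1 = (target - x) / (1.0 - t)
    return (v1 - v) / (1.0 - tau)

rng = np.random.default_rng(0)
x0 = rng.standard_normal(5)          # source locations
x1 = hrf_sample(toy_accel, x0, rng)  # uses n_t * n_tau = 32 acceleration evaluations
```

In this toy setting every sample ends at the target point, because the final Euler step in each loop exactly cancels the remaining gap. With a learned multimodal velocity distribution, the inner loop would instead return different velocities for different draws of `v_0`, which is what lets trajectories cross. The number of model evaluations is the product of the outer and inner step counts, so the two budgets trade off against each other.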
