Efficiency vs. Fidelity: A Comparative Analysis of Diffusion Probabilistic Models and Flow Matching on Low-Resource Hardware
Srishti Gupta, Yashasvee Taiwade
TL;DR
The paper compares Diffusion Probabilistic Models (DDPMs) and Flow Matching on low-resource hardware using a Time-Conditioned U-Net trained on MNIST, revealing that Flow Matching achieves substantially higher efficiency. It demonstrates that Flow Matching learns a near-rectified transport path with curvature $\mathcal{C} \approx 1.02$, versus Diffusion trajectories with $\mathcal{C}$ up to $3.45$, enabling high fidelity with as few as $N=10$ function evaluations. An Euler solver suffices for Flow Matching due to the linearity of the learned vector field, enabling latencies around $1.8$ ms on a constrained NVIDIA T4 and up to $10\times$ fewer evaluations than diffusion. These results establish Flow Matching as the practical choice for real-time, edge-friendly generative tasks, with a clear efficiency frontier and guidance for deployment on limited hardware.
Abstract
Denoising Diffusion Probabilistic Models (DDPMs) have established a new state of the art in generative image synthesis, yet their deployment is hindered by significant computational overhead during inference, often requiring up to 1,000 iterative steps. This study presents a rigorous comparative analysis of DDPMs against the emerging Flow Matching (Rectified Flow) paradigm, isolating the geometric and efficiency properties of both on low-resource hardware. By implementing both frameworks on a shared Time-Conditioned U-Net backbone using the MNIST dataset, we demonstrate that Flow Matching significantly outperforms Diffusion in efficiency. Our geometric analysis reveals that Flow Matching learns a highly rectified, near-optimal transport path (curvature $\mathcal{C} \approx 1.02$), whereas Diffusion trajectories remain stochastic and tortuous ($\mathcal{C} \approx 3.45$). Furthermore, we establish an ``efficiency frontier'' at $N=10$ function evaluations, where Flow Matching retains high fidelity while Diffusion collapses. Finally, we show via numerical sensitivity analysis that the learned vector field is sufficiently linear to render high-order ODE solvers (Runge-Kutta 4) unnecessary, validating the use of lightweight Euler solvers for edge deployment. \textbf{This work concludes that Flow Matching is the superior algorithmic choice for real-time, resource-constrained generative tasks.}
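
The link between path curvature and solver choice can be illustrated with a small toy sketch. It is not the paper's implementation: the two hand-written 2D velocity fields stand in for the trained U-Net, all function names are hypothetical, and the curvature measure used here (trajectory length divided by the straight-line distance between its endpoints, so that $1.0$ means a perfectly straight path) is an assumed definition, since the abstract does not state how $\mathcal{C}$ is computed.

```python
import math

START = [1.0, 0.0]    # toy "noise" sample (assumption, not real data)
TARGET = [-1.0, 0.0]  # toy "data" sample (assumption, not real data)

def straight_field(x, t):
    # Constant velocity TARGET - START: the ideal rectified-flow case.
    return [TARGET[0] - START[0], TARGET[1] - START[1]]

def curved_field(x, t):
    # Half rotation about the origin over t in [0, 1]; same endpoints,
    # but the exact path is a semicircle instead of a line.
    return [-math.pi * x[1], math.pi * x[0]]

def euler_sample(velocity, x0, n_steps):
    """Integrate dx/dt = velocity(x, t) on [0, 1] with forward Euler."""
    dt = 1.0 / n_steps
    x = list(x0)
    path = [list(x)]
    for i in range(n_steps):
        v = velocity(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        path.append(list(x))
    return path

def rk4_sample(velocity, x0, n_steps):
    """Same integration with classical fourth-order Runge-Kutta."""
    dt = 1.0 / n_steps
    x = list(x0)
    path = [list(x)]
    for i in range(n_steps):
        t = i * dt
        k1 = velocity(x, t)
        k2 = velocity([xi + 0.5 * dt * k for xi, k in zip(x, k1)], t + 0.5 * dt)
        k3 = velocity([xi + 0.5 * dt * k for xi, k in zip(x, k2)], t + 0.5 * dt)
        k4 = velocity([xi + dt * k for xi, k in zip(x, k3)], t + dt)
        x = [xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        path.append(list(x))
    return path

def curvature(path):
    """Assumed metric: polyline length divided by endpoint distance."""
    length = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    return length / math.dist(path[0], path[-1])

def endpoint_error(path):
    """Distance between the final integrated point and TARGET."""
    return math.dist(path[-1], TARGET)
```

On the straight field, `euler_sample(straight_field, START, 10)` reaches `TARGET` up to floating-point rounding, and even a single step does, because a constant velocity makes the Euler update exact. On the curved field, ten Euler steps drift far from `TARGET`, while `rk4_sample` with the same step count stays close. In this toy setting, that is the sense in which a nearly linear vector field makes a higher-order solver unnecessary.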
