On the minimax optimality of Flow Matching through the connection to kernel density estimation
Lea Kunkel, Mathias Trabs
TL;DR
This work analyzes Flow Matching, a continuous-normalizing-flow approach to generative modeling, through its connection to kernel density estimation. It shows that the kernel density estimator achieves the minimax rate of convergence in Wasserstein distance up to logarithmic factors, improving existing bounds for the Gaussian kernel. Building on this, the authors prove that Flow Matching attains the same rate, again up to logarithmic factors, when the neural network parameterizing the vector field is sufficiently large, which gives a theoretical foundation for the method's empirical success. They also show that the rates improve when the target distribution lies on a lower-dimensional linear subspace, offering a first justification for Flow Matching's effectiveness in high-dimensional settings.
Abstract
Flow Matching has recently gained attention in generative modeling as a simple and flexible alternative to diffusion models, the current state of the art. While existing statistical guarantees adapt tools from the analysis of diffusion models, we take a different perspective by connecting Flow Matching to kernel density estimation. We first verify that the kernel density estimator matches the optimal rate of convergence in Wasserstein distance up to logarithmic factors, improving existing bounds for the Gaussian kernel. Based on this result, we prove that for sufficiently large networks, Flow Matching also achieves the optimal rate up to logarithmic factors, providing a theoretical foundation for the empirical success of this method. Finally, we provide a first justification of Flow Matching's effectiveness in high-dimensional settings by showing that rates improve when the target distribution lies on a lower-dimensional linear subspace.
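To make the kernel density estimation side of this connection concrete, the following is a minimal sketch and not the authors' construction. It assumes a Gaussian kernel, an arbitrarily chosen bandwidth, and a one-dimensional standard normal target, and it uses hypothetical helper names (`sample_gaussian_kde`, `wasserstein1_1d`). It relies on two standard facts: drawing from a Gaussian kernel density estimator amounts to picking a data point uniformly at random and adding Gaussian noise scaled by the bandwidth, and in one dimension the Wasserstein-1 distance between two equal-size samples is the mean absolute difference of their sorted values.

```python
import numpy as np


def sample_gaussian_kde(data, bandwidth, num_samples, rng):
    """Draw samples from the Gaussian KDE built on `data` (shape (n, d)).

    The KDE is the mixture (1/n) * sum_i N(X_i, bandwidth^2 * I), so a draw
    is a uniformly chosen data point plus scaled Gaussian noise.
    """
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    idx = rng.integers(0, n, size=num_samples)      # uniform choice of centers
    noise = rng.standard_normal((num_samples, d))   # standard Gaussian kernel
    return data[idx] + bandwidth * noise


def wasserstein1_1d(x, y):
    """Empirical Wasserstein-1 distance between two equal-size 1D samples."""
    x = np.sort(np.asarray(x, dtype=float).ravel())
    y = np.sort(np.asarray(y, dtype=float).ravel())
    if x.shape != y.shape:
        raise ValueError("samples must have the same size")
    # In 1D the optimal coupling matches order statistics.
    return float(np.mean(np.abs(x - y)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumption for illustration: the target is a 1D standard normal.
    data = rng.standard_normal((500, 1))
    generated = sample_gaussian_kde(data, bandwidth=0.2, num_samples=2000, rng=rng)
    reference = rng.standard_normal((2000, 1))
    print("empirical W1 between KDE samples and fresh target samples:",
          wasserstein1_1d(generated, reference))
```

The bandwidth of 0.2 and the sample sizes are placeholders. The rates discussed in the abstract concern how such a distance behaves as the number of observations grows under a suitable bandwidth choice, which this sketch does not attempt to reproduce.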
