Convergence of Continuous Normalizing Flows for Learning Probability Distributions

Yuan Gao, Jian Huang, Yuling Jiao, Shurong Zheng

TL;DR

This work establishes non-asymptotic Wasserstein-2 convergence guarantees for simulation-free continuous normalizing flows trained via flow matching. By focusing on CNFs with linear interpolation, it proves Lipschitz regularity of the velocity fields and develops $L^{\infty}$-type neural network approximation bounds that preserve Lipschitz properties. The analysis decomposes end-to-end distribution errors into discretization, velocity-estimation, and early-stopping components, and derives a nearly minimax nonparametric rate $\widetilde{\mathcal{O}}(n^{-1/(d+5)})$ for learning distributions from finite samples, up to polylog factors. The results also connect to broader themes in diffusion models and neural-network approximation theory, yielding practical guarantees for CNF-based distribution learning in high dimensions while highlighting key trade-offs arising from time-singular behavior near the terminal time.
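To get a feel for the rate $\widetilde{\mathcal{O}}(n^{-1/(d+5)})$ quoted above, the following is a small arithmetic sketch. It keeps only the leading term $n^{-1/(d+5)}$ and drops constants and polylogarithmic factors, which is a simplifying assumption. The function names `rate` and `sample_factor_to_halve` are illustrative and do not come from the paper.

```python
def rate(n, d):
    """Leading-order term n^{-1/(d+5)}; constants and polylog factors are ignored."""
    return n ** (-1.0 / (d + 5))


def sample_factor_to_halve(d):
    """Factor by which n must grow to halve n^{-1/(d+5)}, namely 2^(d+5)."""
    return 2 ** (d + 5)


# Illustrative dimensions only; these values are not taken from the paper.
for d in (1, 5, 20):
    print(f"d={d:2d}  rate at n=1e6: {rate(10**6, d):.4f}  "
          f"n multiplier to halve the rate: {sample_factor_to_halve(d)}")
```

Under this simplification, halving the leading term requires multiplying the sample size by $2^{d+5}$, so the sample cost grows exponentially with the dimension $d$, as is typical of nonparametric rates.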

Abstract

Continuous normalizing flows (CNFs) are a generative method for learning probability distributions, which is based on ordinary differential equations. This method has shown remarkable empirical success across various applications, including large-scale image synthesis, protein structure prediction, and molecule generation. In this work, we study the theoretical properties of CNFs with linear interpolation in learning probability distributions from a finite random sample, using a flow matching objective function. We establish non-asymptotic error bounds for the distribution estimator based on CNFs, in terms of the Wasserstein-2 distance. The key assumption in our analysis is that the target distribution satisfies one of the following three conditions: it either has a bounded support, is strongly log-concave, or is a finite or infinite mixture of Gaussian distributions. We present a convergence analysis framework that encompasses the error due to velocity estimation, the discretization error, and the early stopping error. A key step in our analysis involves establishing the regularity properties of the velocity field and its estimator for CNFs constructed with linear interpolation. This necessitates the development of uniform error bounds with Lipschitz regularity control of deep ReLU networks that approximate the Lipschitz function class, which could be of independent interest. Our nonparametric convergence analysis offers theoretical guarantees for using CNFs to learn probability distributions from a finite random sample.
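As a toy illustration of the pipeline described in the abstract, here is a minimal sketch. It assumes the linear interpolation takes the form $X_t = (1-t)Z + tX_1$ with $Z \sim N(0,1)$ at time 0 and the target $X_1$ at time 1, and that sampling stops at time $1 - \underline{t}$. The target is a one-dimensional Gaussian $N(m, s^2)$, chosen because its velocity field $v(t,x) = \mathbb{E}[X_1 - Z \mid X_t = x]$ has a closed form. That closed form stands in for the neural network estimator, so the sketch shows only the discretization and early-stopping errors, not the velocity-estimation error. All function names and parameter values are hypothetical.

```python
import numpy as np


def gaussian_velocity(t, x, m, s):
    """Closed-form v(t, x) = E[X1 - Z | X_t = x] for X_t = (1 - t) Z + t X1,
    with Z ~ N(0, 1) and X1 ~ N(m, s^2) independent (1-D toy case)."""
    var_t = (1.0 - t) ** 2 + (t * s) ** 2   # Var(X_t)
    cov = t * s ** 2 - (1.0 - t)            # Cov(X1 - Z, X_t)
    return m + cov / var_t * (x - t * m)


def euler_sample(n, m, s, t_stop, n_steps, rng):
    """Forward Euler on dx/dt = v(t, x) from t = 0 to t = 1 - t_stop."""
    x = rng.standard_normal(n)               # start from the Gaussian source
    ts = np.linspace(0.0, 1.0 - t_stop, n_steps + 1)
    for t0, t1 in zip(ts[:-1], ts[1:]):
        x = x + (t1 - t0) * gaussian_velocity(t0, x, m, s)
    return x


def w2_gaussian_1d(m1, s1, m2, s2):
    """Wasserstein-2 distance between N(m1, s1^2) and N(m2, s2^2) in 1-D."""
    return float(np.hypot(m1 - m2, s1 - s2))


rng = np.random.default_rng(0)
m, s = 2.0, 0.5                              # hypothetical target parameters
samples = euler_sample(20000, m, s, t_stop=1e-2, n_steps=200, rng=rng)

# Plug-in estimate: the Euler map is affine here, so the samples stay Gaussian.
w2_est = w2_gaussian_1d(samples.mean(), samples.std(), m, s)
print(f"mean={samples.mean():.3f}  std={samples.std():.3f}  W2 estimate={w2_est:.4f}")
```

Shrinking `t_stop` reduces the early-stopping error and increasing `n_steps` reduces the discretization error. In the setting of the paper the velocity field is estimated from data, which adds the third error component.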

Paper Structure

This paper contains 45 sections, 53 theorems, 259 equations, 3 figures, 3 tables.

Key Result

Theorem 1.1

Assume that the target distribution either has a bounded support, is strongly log-concave, or is a mixture of Gaussians. Let $0 < \underline{t} \ll 1$. The velocity fields of the CNFs with linear interpolation have the following regularity properties:

Figures (3)

  • Figure 1: Functions $g_1$ and $g_2$ for defining a partition of unity.
  • Figure 2: Functions $\phi_m(t)$ and $\phi_{m+1}(t)$ for defining a partition of unity.
  • Figure 3: The clipping function $\beta_A$.

Theorems & Definitions (114)

  • Theorem 1.1: Informal
  • Remark 1
  • Theorem 1.2: Informal
  • Remark 2
  • Definition 2.1: Cattiaux and Guillin (2014)
  • Definition 2.2: Eldan and Lee (2018)
  • Definition 2.3: Deep ReLU networks
  • Definition 2.4: Wasserstein-$2$ distance
  • Remark 3: Distribution classes
  • Remark 4: Score Lipschitzness
  • ...and 104 more