Stochastic Sampling from Deterministic Flow Models

Saurabh Singh, Ian Fischer

TL;DR

The paper addresses the limitations of deterministic transport in Gaussian-flow-based methods by introducing a general theorem that turns an ODE into an infinite family of SDEs sharing the same marginals. This enables stochastic samplers that can be tuned at sampling time to trade determinism for diversity and robustness, without retraining the underlying model. The authors derive specific corollaries for Gaussian flows, provide a practical score-imputation approach, and demonstrate improved performance on toy Gaussian tasks and large-scale ImageNet generation, including better FID scores and increased generation diversity under classifier-free guidance. This framework offers a flexible, post-training mechanism to bolster deterministic transport methods with stochasticity, improving reliability under discretization and enabling conditional sampling from intermediate states.

Abstract

Deterministic flow models, such as rectified flows, offer a general framework for learning a deterministic transport map between two distributions, realized as the vector field for an ordinary differential equation (ODE). However, they are sensitive to model estimation and discretization errors and do not permit different samples conditioned on an intermediate state, limiting their application. We present a general method to turn the underlying ODE of such flow models into a family of stochastic differential equations (SDEs) that have the same marginal distributions. This method permits us to derive families of stochastic samplers, for fixed (e.g., previously trained) deterministic flow models, that continuously span the spectrum of deterministic and stochastic sampling, given access to the flow field and the score function. Our method provides additional degrees of freedom that help alleviate the issues with the deterministic samplers and empirically outperforms them. We empirically demonstrate advantages of our method on a toy Gaussian setup and on the large-scale ImageNet generation task. Further, our family of stochastic samplers provides an additional knob for controlling the diversity of generation, which we qualitatively demonstrate in our experiments.
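To make the idea in the abstract concrete, here is a minimal sketch on a 1D Gaussian toy problem. It is not the paper's sampler family: it uses the standard construction in which a score-weighted drift plus matching noise leaves the marginals unchanged, with a constant diffusion scale `alpha`. The interpolation $x_t = (1-t)x_0 + t x_1$ with noise at $t=1$, the data standard deviation `S0`, and all function names are assumptions made for illustration. The score is imputed from the flow field through the Gaussian-flow identity $\nabla \log p_t(x) = -((1-t)\,v(x,t) + x)/t$, which holds for this interpolation with standard normal noise.

```python
import numpy as np

S0 = 2.0  # hypothetical data std: x_0 ~ N(0, S0^2); noise x_1 ~ N(0, 1)


def sigma2(t):
    """Variance of x_t = (1 - t) * x_0 + t * x_1 for independent x_0, x_1."""
    return (1.0 - t) ** 2 * S0 ** 2 + t ** 2


def velocity(x, t):
    """Exact flow field E[x_1 - x_0 | x_t = x] for this Gaussian toy."""
    return (t - (1.0 - t) * S0 ** 2) / sigma2(t) * x


def score_from_velocity(x, t, v):
    """Score imputed from the flow field (requires t > 0)."""
    return -((1.0 - t) * v + x) / t


def sample(x1, steps, alpha, rng):
    """Integrate from t = 1 down to t = 0 with Euler(-Maruyama) steps.

    alpha = 0 is the plain deterministic ODE sampler; alpha > 0 adds noise
    and a compensating score drift so the marginals are preserved in the
    continuous-time limit.
    """
    x = np.array(x1, dtype=float)
    h = 1.0 / steps
    for k in range(steps):
        t = 1.0 - k * h  # t stays in [h, 1], so dividing by t is safe
        v = velocity(x, t)
        s = score_from_velocity(x, t, v)
        # Time runs backward, so the flow contributes -v; the score term
        # offsets the spreading caused by the injected noise.
        drift = -v + 0.5 * alpha ** 2 * s
        noise = rng.standard_normal(x.shape)
        x = x + h * drift + alpha * np.sqrt(h) * noise
    return x
```

Calling `sample` twice from the same `x1` with `alpha=0.0` returns identical outputs, while `alpha > 0` yields different outputs from the same starting draw, which is the diversity knob the abstract refers to. In a real model, `velocity` would be the trained network rather than a closed form.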

Paper Structure (41 sections, 6 theorems, 57 equations, 12 figures, 2 tables)

This paper contains 41 sections, 6 theorems, 57 equations, 12 figures, 2 tables.

Key Result

Theorem 1

Let $p_t(x)$ be the probability density of the solutions of the SDE in eq:general_sde_paper, evolving as $\frac{\partial p_t}{\partial t}$. Then the probability density of solutions of the following set of SDEs, indexed by $\tilde{G}, \gamma_t$, also evolves as $\frac{\partial p_t}{\partial t}$: [...] where $\tilde{G} \equiv \tilde{G}(x, t)$ and $\gamma_t \ge 0$ are arbitrary functions such that the so...
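
As a sanity check on the kind of statement Theorem 1 makes, the following worked special case may help. It is the familiar scalar-diffusion construction, written in forward time, and not the paper's general family indexed by $\tilde{G}, \gamma_t$; the symbols $v$ (flow field) and $g_t$ (scalar diffusion coefficient) are assumptions introduced here. The only identity used is $p_t \nabla \log p_t = \nabla p_t$.

```latex
% ODE dx = v(x,t) dt: marginals obey the continuity equation
\frac{\partial p_t}{\partial t} = -\nabla \cdot \big( v(x,t)\, p_t \big).

% Candidate SDE with extra score drift and noise of scale g_t
dx = \Big[ v(x,t) + \tfrac{1}{2} g_t^2\, \nabla \log p_t(x) \Big]\, dt + g_t\, dw.

% Its Fokker-Planck equation: the score drift and the diffusion term cancel
\frac{\partial p_t}{\partial t}
  = -\nabla \cdot \big( v\, p_t \big)
    - \tfrac{1}{2} g_t^2\, \nabla \cdot \big( p_t \nabla \log p_t \big)
    + \tfrac{1}{2} g_t^2\, \Delta p_t
  = -\nabla \cdot \big( v\, p_t \big).
```

Both processes therefore share the same $\frac{\partial p_t}{\partial t}$ for any $g_t \ge 0$, with $g_t = 0$ recovering the deterministic flow.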

Figures (12)

  • Figure 1: Stochastic sampling improves diversity at all classifier-free guidance levels. We visualize samples from a rectified flow model at four classifier-free guidance levels $\lambda$ (\ref{sec:score_function}) and at four stochasticity scales $\alpha$ for NonSingular (\ref{tab:sde_family_table}). Three samples are shown for each configuration where the sampling starts at the same draw from $p_1(x_1)$. When $\alpha=0$, the sampler is deterministic and samples are the same (therefore we show only one). When $\lambda=0$, there is no classifier-free guidance. Note the increased diversity as $\alpha$ increases. More examples in \ref{fig:qual_alpha_vs_lambda}.
  • Figure 2: Discretization of deterministic flow leads to bias. Comparison of samplers from \ref{tab:sde_family_table} on the two-Gaussian toy problem (\ref{sec:toy_details}). Deterministic underestimates the variance parameter, but the stochastic samplers avoid that issue, in exchange for variance in the parameter estimation. Singular's variance diverges if we start from $t=1$, so instead we start the sampler at $t=1-10^{-3}$, which allows it to eventually converge by $t=0$.
  • Figure 3: Stochasticity is most helpful at coarser discretizations. We visualize the effect of coarseness of discretization by sampling for 100 and 500 sampling steps. See \ref{fig:toy_samplers} for the same plots at 50 steps, which shows more extreme bias in variance for Deterministic and Singular.
  • Figure 4: Stochasticity helps mitigate bias. We plot the error in mean and error in variance for NonSingular for a set of diffusion coefficient scales $\alpha \in \{0.0, 0.5, 1.0, 1.5, 2.0, 2.5\}$. Estimates for variance at $t=0$ improve as $\alpha$ increases, leading to a drop in KL divergence from the true distribution. However, with very high $\alpha$ values intermediate marginals develop a bias.
  • Figure 5: Non-singular samplers work well over a broad range of $\alpha$. Plots of FID for each sampler as the diffusion coefficient scale $\alpha$ is increased. Note that at $\alpha=0$ all samplers coincide. See \ref{fig:imagenet_fid_alpha_uncropped} for a larger range of FIDs.
  • ...and 7 more figures
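
The quantities named in the Figure 4 caption (error in mean, error in variance, KL divergence from the true distribution) can be sketched for a 1D Gaussian target as follows. This is an illustrative helper, not the paper's evaluation code: the function name, the moment-matched Gaussian fit, and the direction of the KL divergence (fitted distribution relative to the true one) are all assumptions.

```python
import numpy as np


def gaussian_fit_errors(samples, true_mean, true_var):
    """Compare moment-matched sample statistics with a 1D Gaussian target.

    Returns the signed error in mean, the signed error in variance, and
    KL(N(m, v) || N(true_mean, true_var)), where m and v are the sample
    mean and (biased) sample variance.
    """
    m = float(np.mean(samples))
    v = float(np.var(samples))
    kl = 0.5 * (
        np.log(true_var / v)
        + (v + (m - true_mean) ** 2) / true_var
        - 1.0
    )
    return {
        "mean_error": m - true_mean,
        "var_error": v - true_var,
        "kl": float(kl),
    }
```

A sampler that underestimates the variance shows up here as a negative `var_error` and a positive `kl`, even when `mean_error` is close to zero.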

Theorems & Definitions (9)

  • Theorem 1
  • Corollary 1.1
  • Corollary 1.2
  • Theorem 1
  • proof
  • Corollary 1.2
  • proof
  • Corollary 1.2
  • proof