Simulation-free Schrödinger bridges via score and flow matching

Alexander Tong, Nikolay Malkin, Kilian Fatras, Lazar Atanackovic, Yanlei Zhang, Guillaume Huguet, Guy Wolf, Yoshua Bengio

TL;DR

This work introduces simulation-free score and flow matching ([SF]$^2$M), a method for inferring continuous-time stochastic dynamics between arbitrary source and target distributions. It frames the problem as a Schrödinger bridge (SB), which can be solved through entropic optimal transport. By learning conditional drift and score functions with a stochastic regression objective and Brownian-bridge conditioning, [SF]$^2$M constructs the Schrödinger bridge without simulating trajectories and allows inference with arbitrary diffusion schedules. The approach unifies score-based diffusion and continuous normalizing flows and scales to high-dimensional modeling of complex systems, including single-cell development and gene regulatory networks (GRNs), supported by theoretical guarantees and empirical results. The method is evaluated on synthetic benchmarks and real-world cellular data, where it recovers the SB accurately, models high-dimensional cell dynamics, and recovers GRNs. Code is available for reproducibility.

Abstract

We present simulation-free score and flow matching ([SF]$^2$M), a simulation-free objective for inferring stochastic dynamics given unpaired samples drawn from arbitrary source and target distributions. Our method generalizes both the score-matching loss used in the training of diffusion models and the recently proposed flow matching loss used in the training of continuous normalizing flows. [SF]$^2$M interprets continuous-time stochastic generative modeling as a Schrödinger bridge problem. It relies on static entropy-regularized optimal transport, or a minibatch approximation, to efficiently learn the SB without simulating the learned stochastic process. We find that [SF]$^2$M is more efficient and gives more accurate solutions to the SB problem than simulation-based methods from prior work. Finally, we apply [SF]$^2$M to the problem of learning cell dynamics from snapshot data. Notably, [SF]$^2$M is the first method to accurately model cell dynamics in high dimensions and can recover known gene regulatory networks from simulated data. Our code is available in the TorchCFM package at https://github.com/atong01/conditional-flow-matching.
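
The static entropy-regularized OT step mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the TorchCFM implementation: the helper names `sinkhorn_plan` and `sample_pairs`, the squared Euclidean cost, the uniform sample weights, and the value of `eps` are all assumptions chosen for illustration. How `eps` should be tied to the diffusion rate $\sigma$ is not stated in this summary, so it is left as a free parameter here.

```python
import numpy as np

def sinkhorn_plan(x0, x1, eps, n_iters=200):
    """Entropic OT plan between two uniformly weighted minibatches.

    Hypothetical helper: plain Sinkhorn iterations on a squared
    Euclidean cost, not the routine used by the authors.
    """
    # Pairwise squared Euclidean cost, shape (n, m).
    cost = ((x0[:, None, :] - x1[None, :, :]) ** 2).sum(-1)
    n, m = cost.shape
    a = np.full(n, 1.0 / n)   # uniform source weights (assumption)
    b = np.full(m, 1.0 / m)   # uniform target weights (assumption)
    K = np.exp(-cost / eps)   # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)       # rescale to match row marginals
        v = b / (K.T @ u)     # rescale to match column marginals
    return u[:, None] * K * v[None, :]

def sample_pairs(plan, n_pairs, rng):
    """Draw index pairs (i, j) with probability proportional to plan[i, j]."""
    p = plan.ravel() / plan.sum()
    idx = rng.choice(plan.size, size=n_pairs, p=p)
    return np.unravel_index(idx, plan.shape)

# Toy minibatches standing in for unpaired source and target samples.
rng = np.random.default_rng(0)
x0 = rng.normal(size=(64, 2))
x1 = rng.normal(size=(64, 2)) + np.array([4.0, 0.0])

plan = sinkhorn_plan(x0, x1, eps=2.0)
i, j = sample_pairs(plan, 64, rng)   # x0[i], x1[j] are the coupled pairs
```

The pairs `(x0[i], x1[j])` would then serve as endpoints for the conditional paths. For small `eps` or large costs, a log-domain Sinkhorn variant would be numerically safer than the plain kernel form sketched here.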

Paper Structure (58 sections, 3 theorems, 37 equations, 7 figures, 9 tables, 5 algorithms)

Key Result

Proposition 2.1

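Let the reference process be a Brownian motion (i.e., $\mathbb{Q} = \sigma \mathbb{W}$). Then the Schrödinger bridge problem admits a unique solution ${\mathbb{P}}^*$ having the form of a mixture of Brownian bridges weighted by an entropic OT plan:

$${\mathbb{P}}^*((x_t)_{t\in[0,1]}) = \int {\mathbb{Q}}((x_t)_{t\in(0,1)}\mid x_0,x_1)\, d\pi^*(x_0,x_1),$$

where $\pi^*$ denotes the entropic OT plan between the source and target marginals and ${\mathbb{Q}}((x_t)_{t\in(0,1)}\mid x_0,x_1)$ is the Brownian bridge between $x_0$ and $x_1$ with diffusion rate $\sigma$.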
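
Because the solution in Proposition 2.1 is a mixture of Brownian bridges, an intermediate point can be drawn directly once an endpoint pair $(x_0, x_1)$ is given, with no trajectory simulation. The sketch below assumes the standard Gaussian marginal of a Brownian bridge, $\mathcal{N}\big((1-t)x_0 + t x_1,\ \sigma^2 t(1-t) I\big)$, and derives the conditional score and ODE drift from that Gaussian. The function names are hypothetical, and the closed forms are a reading of the Gaussian path rather than formulas quoted in this summary.

```python
import numpy as np

def brownian_bridge_sample(x0, x1, t, sigma, rng):
    """Sample x_t from the Brownian bridge between x0 and x1 (simulation-free)."""
    mean = (1.0 - t) * x0 + t * x1
    std = sigma * np.sqrt(t * (1.0 - t))
    return mean + std * rng.standard_normal(x0.shape)

def conditional_targets(xt, x0, x1, t, sigma):
    """Regression targets at x_t: conditional ODE drift and conditional score.

    Assumes the Gaussian marginal N(mean, sigma^2 t (1 - t) I); t must lie
    strictly inside (0, 1) because both targets diverge at the endpoints.
    """
    mean = (1.0 - t) * x0 + t * x1
    var = sigma**2 * t * (1.0 - t)
    # Score of a Gaussian: (mean - x) / variance.
    score = (mean - xt) / var
    # Drift of the Gaussian probability path: (std' / std) * (x - mean) + mean'.
    drift = (1.0 - 2.0 * t) / (2.0 * t * (1.0 - t)) * (xt - mean) + (x1 - x0)
    return drift, score

# Toy check at t = 0.5 with fixed endpoints.
rng = np.random.default_rng(0)
n = 200_000
x0 = np.zeros((n, 2))
x1 = np.full((n, 2), 2.0)
t = np.full((n, 1), 0.5)   # one time per sample, broadcast over dimensions

xt = brownian_bridge_sample(x0, x1, t, sigma=1.0, rng=rng)
drift, score = conditional_targets(xt, x0, x1, t, sigma=1.0)
```

In a training loop, `t` would typically be drawn at random in $(0,1)$ per sample, and two networks would be regressed onto `drift` and `score` with a squared-error loss. Any time-dependent weighting of the score term is a design choice not covered by this sketch.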

Figures (7)

  • Figure 1: Left: ODE and SDE paths from 8gaussians to moons, sampled from a model trained using [SF]$^2$M. [SF]$^2$M makes it possible to vary the diffusion schedule at inference time and thus interpolate between ODEs and SDEs that have the same marginal densities. Right: Illustration of the stochastic regression objective in [SF]$^2$M. Given a source point $x_0$ and target point $x_1$ sampled from an entropic OT plan between marginals, an intermediate point $x_t$ is sampled from the Brownian bridge (marginal in light blue) in a simulation-free way. Neural networks are regressed to the ODE drift $u^\circ_t(x_t|x_0,x_1)$ and to the conditional score $\nabla\log p_t(x_t|x_0,x_1)$. The regression objective is stochastic, as the same point $x_t$ may appear on different conditional paths, e.g., the dotted path from $x_0'$ to $x_1'$. The stochastic regression recovers dynamics that transform the marginal at time 0 to that at time 1.
  • Figure 2: Visualization of the learned Waddington's landscape $W$ with a bifurcating trajectory, for one Gaussian to two Gaussians (left) and for the Embryoid Body (EB) data (Moon et al., 2019) (right). The dimensions are space (left-right), time (forward-back), and potential (up-down).
  • Figure 3: Simulation of trajectories from a given cell on 2D EB data. Left: Probability flow ODE trajectory, approximated by SB-CFM (Tong et al., 2023). Right: Five SDE trajectories from [SF]$^2$M; additional target samples (20) are shown in blue.
  • Figure 4: Optimal transport couplings for different OT costs and batch sizes on a 2D example. The top row shows the OT matching between samples, and the bottom row shows the minibatch OT plan. Combining entropic OT with minibatches leads to a uniform plan, unlike using entropic regularization alone or the minibatch approximation alone.
  • Figure 5: Learned ODE (top) and SDE (bottom) simulations for $\sigma \in \{0.1, 1, 2, 3\}$, from left to right, for a trained [SF]$^2$M model. The marginals match regardless of the chosen $\sigma$. Trajectory initializations are matched between runs.
  • ...and 2 more figures
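
Figures 1 and 5 state that ODE and SDE simulations share the same marginals for any diffusion level. A short way to see why: adding $\tfrac{g^2}{2}\nabla\log p_t$ to the ODE drift while injecting noise of scale $g$ leaves the Fokker-Planck equation unchanged, since the extra drift term cancels the diffusion term. The toy check below is a sketch under strong assumptions: it replaces the learned networks with closed-form drift and score for the one-dimensional path $p_t = \mathcal{N}(0, 1+t)$, uses a constant diffusion level `g`, and integrates with a basic Euler-Maruyama scheme. All names are hypothetical.

```python
import numpy as np

def ode_drift(x, t):
    """Probability-flow drift for the toy path p_t = N(0, 1 + t)."""
    return x / (2.0 * (1.0 + t))

def score(x, t):
    """Score of N(0, 1 + t), i.e. the gradient of log p_t."""
    return -x / (1.0 + t)

def simulate(g, n_samples=100_000, n_steps=500, seed=0):
    """Euler-Maruyama for dx = [u + (g^2 / 2) * score] dt + g dW on [0, 1]."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_samples)   # samples from p_0 = N(0, 1)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        drift = ode_drift(x, t) + 0.5 * g**2 * score(x, t)
        noise = rng.standard_normal(n_samples)
        x = x + drift * dt + g * np.sqrt(dt) * noise
    return x

# g = 0 is the ODE; g > 0 gives SDEs that should reach the same marginal at t = 1.
final_vars = {g: simulate(g).var() for g in (0.0, 1.0, 2.0)}
```

Each entry of `final_vars` should be close to the target variance of 2, up to Monte Carlo and discretization error. With a trained model, `ode_drift` and `score` would be replaced by the two regressed networks.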

Theorems & Definitions (5)

  • Proposition 2.1 (Föllmer, 1988)
  • Definition A.1: minibatch transport plan (Fatras et al., 2020)
  • Lemma E.1: Brownian bridge with time-varying diffusion
  • proof
  • Corollary E.2