Stochastic interpolants with data-dependent couplings

Michael S. Albergo, Mark Goldstein, Nicholas M. Boffi, Rajesh Ranganath, Eric Vanden-Eijnden

TL;DR

The paper generalizes stochastic interpolants by introducing data-dependent couplings between a base and a target density, enabling conditional generative modeling that leverages information beyond plain conditioning. It formalizes a transport framework where the evolving density obeys a continuity equation and demonstrates that the optimal velocity field and score can be learned via simple quadratic losses, analogous to standard flow-based methods. By designing couplings that tie the base to the target (and optionally conditioning on auxiliary information), the authors show reduced transport costs and improved sample fidelity in tasks like image inpainting and super-resolution. Empirically, data-dependent couplings yield measurable gains in FID and visual quality on ImageNet-based tasks, with no inference-time corrections required, highlighting practical impact for conditional image synthesis and restoration. The work suggests broad applicability of coupled base distributions in generative modeling and points to future directions in scientific domains and complex autoencoding scenarios.

Abstract

Generative models inspired by dynamical transport of measure -- such as flows and diffusions -- construct a continuous-time map between two probability densities. Conventionally, one of these is the target density, only accessible through samples, while the other is taken as a simple base density that is data-agnostic. In this work, using the framework of stochastic interpolants, we formalize how to couple the base and the target densities, whereby samples from the base are computed conditionally given samples from the target in a way that is different from (but does not preclude) incorporating information about class labels or continuous embeddings. This enables us to construct dynamical transport maps that serve as conditional generative models. We show that these transport maps can be learned by solving a simple square loss regression problem analogous to the standard independent setting. We demonstrate the usefulness of constructing dependent couplings in practice through experiments in super-resolution and in-painting.
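To make the square loss regression concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a 1-D toy target (a two-mode Gaussian mixture), a linear interpolant $I_t = (1-t)x_0 + t x_1$ with no extra latent noise term, and a hypothetical coupling $x_0 = x_1 + \sigma z$; all function names are illustrative. It evaluates the quadratic loss of a trivial zero velocity field, which reduces to the mean squared displacement $\mathbb{E}|x_1 - x_0|^2$, used here only as a rough proxy for transport cost.

```python
import random

def sample_target(rng):
    # Hypothetical 1-D target: a two-mode Gaussian mixture (an assumption).
    center = rng.choice([-4.0, 4.0])
    return center + 0.5 * rng.gauss(0.0, 1.0)

def sample_pair(rng, coupled, sigma=1.0):
    # Draw (x0, x1) from the joint density rho(x0, x1).
    x1 = sample_target(rng)
    if coupled:
        # Data-dependent coupling: x0 ~ rho_0(x0 | x1), here x1 plus noise.
        x0 = x1 + sigma * rng.gauss(0.0, 1.0)
    else:
        # Independent coupling: x0 ~ N(0, 1), drawn without looking at x1.
        x0 = rng.gauss(0.0, 1.0)
    return x0, x1

def regression_example(rng, coupled):
    # One training example for the velocity regression.
    x0, x1 = sample_pair(rng, coupled)
    t = rng.random()                 # t ~ Uniform(0, 1)
    x_t = (1.0 - t) * x0 + t * x1    # assumed linear interpolant
    target = x1 - x0                 # time derivative of the interpolant
    return t, x_t, target

def square_loss(velocity, rng, coupled, n=20000):
    # Monte Carlo estimate of E |velocity(t, I_t) - dI_t/dt|^2.
    total = 0.0
    for _ in range(n):
        t, x_t, target = regression_example(rng, coupled)
        total += (velocity(t, x_t) - target) ** 2
    return total / n

def zero_velocity(t, x):
    # Trivial candidate field; its loss equals E |x1 - x0|^2.
    return 0.0

loss_coupled = square_loss(zero_velocity, random.Random(0), coupled=True)
loss_independent = square_loss(zero_velocity, random.Random(0), coupled=False)
print(loss_coupled, loss_independent)
```

In practice `zero_velocity` would be replaced by a parametric model trained to minimize the same loss. In this toy setting the coupled pairs sit much closer together than the independent ones, so the regression targets are smaller in magnitude.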

Paper Structure (24 sections, 6 theorems, 44 equations, 7 figures, 3 tables, 2 algorithms)

Key Result

Theorem 3.1

The probability distribution of the stochastic interpolant $I_t$ defined in eq:stochinterpolant has a density $\rho_t(x)$ that satisfies $\rho_{t=0}(x)= \rho_0(x)$ and $\rho_{t=1}(x) = \rho_1(x)$, and solves the transport equation $\partial_t \rho_t(x) + \nabla \cdot \left(b_t(x)\rho_t(x)\right) = 0$, where the velocity field $b_t(x)$ is defined in eq:g:c. Moreover, for every $t$ such that $\gamma_t \neq 0$, the following identity for the score holds: … Finally, the f…

Figures (7)

  • Figure 1: Examples. Super-resolution and in-painting results computed with our formalism.
  • Figure 2: Data-dependent couplings are different from conditioning. Delineating between constructing couplings versus conditioning the velocity field, and their implications for the corresponding probability flow $X_t$. The transport problem is flowing from a Gaussian Mixture Model (GMM) with 3 modes to another GMM with 3 modes. Left: The probability flow $X_t$ arising from the data-dependent coupling $\rho(x_0, x_1) = \rho_1(x_1)\rho_0(x_0 | x_1)$. All samples follow simple trajectories. No auxiliary modes form in the intermediate density $\rho(t)$, in contrast to the independent case. Center: When the velocity field $b_t(x,\xi)$ is conditioned on each class (mode), it factorizes, resulting in three separate probability flows $X_t^{\xi}$ with $\xi = 1,2,3$. Right: The probability flow $X_t$ when taking an unconditional velocity field $b_t(x)$ and an independent coupling $\rho(x_0, x_1) = \rho_0(x_0)\rho_1(x_1)$. Note the complexity of the underlying transport, which motivates us to consider finding correlated base variables directly in the data.
  • Figure 3: Image inpainting: ImageNet-$256\times 256$ and ImageNet-$512\times 512$. Top panels: Six examples of image in-filling at resolution $256\times 256$, where the left columns display masked images, the center corresponds to in-filled model samples, and the right shows full reference images. The aim is not to recover the precise content of the reference image but to provide a conditionally valid in-filling. Bottom panels: Four examples at resolution $512\times 512$.
  • Figure 4: Super-resolution: Top four rows: Super-resolved images from resolution $64\times 64 \mapsto 256\times 256$, where the left-most image is the lower resolution version, the middle is the model output, and the right is the ground truth. Examples for $256\times 256 \mapsto 512\times 512$ are given in \ref{fig:super512}.
  • Figure 5: Additional examples of in-filling on the $256\times256$ resolution images, with temporal slices of the probability flow.
  • ...and 2 more figures

Theorems & Definitions (12)

  • Definition 3.1: Stochastic interpolant with coupling
  • Remark 3.1: Incorporating conditioning
  • Theorem 3.1: Transport equation with coupling
  • Corollary 3.1: Probability flow and diffusions with coupling
  • Proposition 3.1: Control of transport cost
  • Definition 1.1: Stochastic interpolant with coupling and conditioning
  • Theorem 1.1: Transport equation with coupling and conditioning
  • proof
  • Corollary 1.1: Probability flow and diffusions with coupling and conditioning
  • proof
  • ...and 2 more