Multitask Learning with Stochastic Interpolants

Hugo Negrel, Florentin Coeurdoux, Michael S. Albergo, Eric Vanden-Eijnden

TL;DR

This work generalizes stochastic interpolants by replacing the scalar time variable with vectors, matrices, or linear operators, allowing the construction of versatile generative models capable of fulfilling multiple tasks without task-specific training.

Abstract

We propose a framework for learning maps between probability distributions that broadly generalizes the time dynamics of flow and diffusion models. To enable this, we generalize stochastic interpolants by replacing the scalar time variable with vectors, matrices, or linear operators, allowing us to bridge probability distributions across multiple dimensional spaces. This approach enables the construction of versatile generative models capable of fulfilling multiple tasks without task-specific training. Our operator-based interpolants not only provide a unifying theoretical perspective for existing generative models but also extend their capabilities. Through numerical experiments, we demonstrate the zero-shot efficacy of our method on conditional generation and inpainting, fine-tuning and posterior sampling, and multiscale modeling, suggesting its potential as a generic task-agnostic alternative to specialized models.

Paper Structure

This paper contains 31 sections, 14 theorems, 76 equations, 8 figures, 2 tables, 2 algorithms.

Key Result

Lemma 2.0

Let $\nu(d\alpha,d\beta)$ be a probability distribution whose support is $S$. Then the drifts $\eta_{0,1}(\alpha,\beta,x)$ in the definition of the multipurpose drift can be characterized globally for all $(\alpha,\beta) \in S$ and all $x\in \mathop{\rm supp}(\mu_{\alpha,\beta})$ via the solution of an optimization problem whose objective is expressed in the norm $\|\cdot\|$ of $\mathcal{H}$.

Figures (8)

  • Figure 1: Multi-task, self-supervised sampling: Schematic representation of various sub-tasks that are captured by the minimizer of our learning objective using the Hadamard-product interpolant in \ref{eq:stoch:interp:hada}. A generative task is chosen in a zero-shot manner by specifying $\alpha$ as a function of time after training. This $\alpha_t$ serves as a continual self-supervision of what has been unmasked vs. what remains. Top: $\alpha_t$ is chosen to generate pixels in an autoregressive fashion. Middle: $\alpha_t$ is chosen to sample along a fractal Morton order. Bottom: $\alpha_t$ can be chosen to do zero-shot inpainting.
  • Figure 2: Multichannel denoising: Possible interpolations fulfilled by various choices of operators in \ref{eq:ex:op}. We present two such examples in the form of Gaussian and motion blurring, realized by interpolations defined in the Fourier domain.
  • Figure 3: Left: In-painting on MNIST using various corruptions. Right: Image generation in arbitrary orders, starting from the same initial noise, with examples showing autoregressive, block-wise, and column-wise.
  • Figure 4: Inpainting using various masks. Left panels: AFHQ-Cat ($256\times256$). Right panels: CelebA ($128\times128$). Block and random corruptions are scored against related works in Table \ref{tab:inpainting_results}, showing competitive or superior performance in all metrics.
  • Figure 5: Simulating a lattice $\phi^4$ theory. Top left: $L=32 \times L = 32$ lattice configurations at the phase transition. Bottom left: lattice examples with drift parameter $h=0.02$. Top middle: Generated lattice examples at phase transition. Bottom middle: generated lattice examples with field $h=0.02$. Right: magnetization of 2000 lattice configurations.
  • ...and 3 more figures
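
To make the idea in the Figure 1 caption concrete, here is a minimal sketch of how a vector-valued $\alpha$ can select a task after training. It is an illustration under stated assumptions, not the paper's implementation: the entrywise form $(1-\alpha)\odot x_0 + \alpha\odot x_1$, the convention that $\alpha_i = 1$ means "entry $i$ is unmasked", and the function names `hadamard_interpolant`, `autoregressive_alpha`, and `inpainting_alpha` are all hypothetical choices made for this example.

```python
def hadamard_interpolant(alpha, x0, x1):
    """Entrywise mix (assumed form): alpha_i = 0 keeps the noise entry x0_i,
    alpha_i = 1 keeps the data entry x1_i."""
    return [(1.0 - a) * n + a * d for a, n, d in zip(alpha, x0, x1)]


def autoregressive_alpha(t, dim):
    """Hypothetical schedule: as t goes from 0 to 1, entries switch from
    0 to 1 one after another in index order."""
    return [min(1.0, max(0.0, t * dim - i)) for i in range(dim)]


def inpainting_alpha(t, known):
    """Hypothetical schedule: known entries are held at 1 throughout,
    the remaining entries ramp up linearly with t."""
    return [1.0 if k else t for k in known]


# Toy 4-entry "image": x0 plays the role of noise, x1 the role of data.
x0 = [0.5, -1.0, 2.0, 0.0]
x1 = [1.0, 1.0, 1.0, 1.0]

# Halfway through the autoregressive schedule the first two entries are
# fully unmasked and the last two are still pure noise.
halfway = hadamard_interpolant(autoregressive_alpha(0.5, 4), x0, x1)
```

The point of the sketch is that both schedules feed the same interpolant, so switching from autoregressive generation to inpainting only changes the path $\alpha_t$, not the mixing rule.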

Theorems & Definitions (27)

  • Definition 2.0: Operator-based interpolants
  • Definition 2.0: Multipurpose drift
  • Lemma 2.0: Drift objective
  • Lemma 2.0: Score
  • Proposition 2.0: Probability flow
  • Proposition 2.0: Diffusion
  • Proposition 3.0
  • Definition A.0: Multipurpose drift
  • Lemma A.0: Drift objective
  • proof
  • ...and 17 more