Training-Free Generative Modeling via Kernelized Stochastic Interpolants
Florentin Coeurdoux, Etienne Lempereur, Nathanaël Cuvelle-Magar, Thomas Eboli, Stéphane Mallat, Anastasia Borovykh, Eric Vanden-Eijnden
TL;DR
The paper introduces training-free generative modeling by kernelizing stochastic interpolants: the drift is restricted to the finite-dimensional form $\hat b_t(x)=\nabla\phi(x)^\top\eta_t$, and the coefficients $\eta_t$ are obtained by solving a $P\times P$ linear system, where $P$ is independent of the data dimension $d$. Because the estimated drift is inexact, the diffusion schedule matters; it is chosen as $D_t^* = \alpha_t\gamma_t/\beta_t$, which minimizes a path KL bound and so limits the effect of drift estimation error on sample quality. A dedicated integrator handles the endpoint behavior of this schedule ($D_t^*$ diverges at $t=0$ and vanishes at $t=1$). The framework accommodates diverse feature maps, including scattering spectra and pretrained velocity fields, which enables training-free generation and cross-model combination. Applications span financial time series, turbulence, and high-resolution image generation; in the ensemble demonstrations, weak models combined through the linear system can surpass each individual weak learner. The approach offers a scalable, training-free route to generative modeling and model fusion, complementary to moment-guided diffusion methods.
Abstract
We develop a kernel method for generative modeling within the stochastic interpolant framework, replacing neural network training with linear systems. The drift of the generative SDE is $\hat b_t(x) = \nabla\phi(x)^\top\eta_t$, where $\eta_t\in\mathbb{R}^P$ solves a $P\times P$ system computable from data, with $P$ independent of the data dimension $d$. Since estimates are inexact, the diffusion coefficient $D_t$ affects sample quality; the optimal $D_t^*$ from Girsanov diverges at $t=0$, but this poses no difficulty and we develop an integrator that handles it seamlessly. The framework accommodates diverse feature maps -- scattering transforms, pretrained generative models etc. -- enabling training-free generation and model combination. We demonstrate the approach on financial time series, turbulence, and image generation.
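To make the "$P\times P$ system computable from data" concrete, here is a minimal sketch of one way such a system can arise. It is not the paper's construction. It assumes a linear interpolant $I_t=(1-t)x_0+t x_1$ with Gaussian base samples $x_0$ and data samples $x_1$, and it fits $\eta_t$ by ordinary least squares of $\nabla\phi(I_t)^\top\eta$ against $\dot I_t = x_1 - x_0$. The feature map $\phi(x)=(|x|^2/2,\ \sum_i x_i)$ and the names `jac_features`, `solve_eta`, and `drift` are hypothetical and chosen only for illustration.

```python
import numpy as np

def jac_features(x):
    """Jacobian of the hypothetical feature map phi(x) = (|x|^2 / 2, sum_i x_i).

    x has shape (n, d); the result has shape (n, P, d) with P = 2:
    row 0 is grad(|x|^2 / 2) = x, row 1 is grad(sum_i x_i) = ones.
    """
    return np.stack([x, np.ones_like(x)], axis=1)

def solve_eta(x0, x1, t):
    """Least-squares coefficients eta_t for the drift grad(phi)^T eta.

    Assumes the linear interpolant I_t = (1 - t) x0 + t x1 (an assumption of
    this sketch) and regresses grad(phi)(I_t)^T eta onto dI_t/dt = x1 - x0.
    """
    it = (1.0 - t) * x0 + t * x1              # interpolant samples, (n, d)
    dit = x1 - x0                             # time derivative, (n, d)
    jac = jac_features(it)                    # (n, P, d)
    n = it.shape[0]
    # Normal equations: K is P x P whatever the data dimension d is.
    K = np.einsum("npd,nqd->pq", jac, jac) / n
    r = np.einsum("npd,nd->p", jac, dit) / n
    return np.linalg.solve(K, r)

def drift(x, eta):
    """Evaluate b_hat(x) = grad(phi)(x)^T eta for a batch x of shape (n, d)."""
    return np.einsum("npd,p->nd", jac_features(x), eta)

# Toy usage: Gaussian base, shifted and rescaled Gaussian "data".
rng = np.random.default_rng(0)
n, d = 20_000, 8
x0 = rng.standard_normal((n, d))
x1 = 2.0 + 0.5 * rng.standard_normal((n, d))
eta_half = solve_eta(x0, x1, t=0.5)           # shape (P,) = (2,)
b_half = drift(x0[:4], eta_half)              # shape (4, d)
```

In this toy setting both endpoints are Gaussian, so the exact drift is affine in $x$ and the two gradient features can represent it. For real data the quality of $\hat b_t$ depends on the choice of $\phi$. The matrix `K` stays $2\times 2$ as `d` grows, which is the sense in which $P$ is independent of $d$.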
