Unfolding Generative Flows with Koopman Operators: Fast and Interpretable Sampling
Erkan Turan, Aristotelis Siozopoulos, Louis Martinez, Julien Gaubil, Emery Pierson, Maks Ovsjanikov
TL;DR
This work addresses the slow sampling and limited interpretability of Continuous Normalizing Flows by learning a global Koopman representation of the dynamics of a pre-trained Conditional Flow Matching model. The method appends time to the state to obtain an autonomous system, then trains an encoder, a generator, and a decoder so that the latent evolution is linear and governed by a finite-dimensional generator $L$. This allows one-step sampling via the matrix exponential $e^{L}$ and spectral interpretability through Koopman modes. A simulation-free training objective enforces infinitesimal consistency with the teacher's vector field along full trajectories, so the learned dynamics stay faithful to the teacher over the whole generative path rather than only at its endpoints, as in boundary-only methods. Empirically, the approach achieves competitive sample quality with dramatic speedups, and the learned Koopman structure supports mode-based editing and transfer to pixel-space dynamics, highlighting practical benefits for controllable and interpretable generative modeling.
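The step from a linear latent generator to one-step sampling can be made explicit with a short derivation. This is a sketch under assumed notation, not necessarily the paper's exact formulation: only $L$ and $e^{L}$ appear in the text above, while $\tilde{x}$, $v$, $\varphi$ (encoder), and $\psi$ (decoder) are symbols introduced here for illustration.

```latex
% Assumed notation: \tilde{x} = (x, t) is the time-augmented state,
% v the teacher's vector field, \varphi the encoder, \psi the decoder.
\begin{align*}
  % Appending time makes the teacher ODE autonomous:
  \dot{\tilde{x}} &= \tilde{v}(\tilde{x}) = \bigl(v(x, t),\, 1\bigr), \\
  % Koopman linearity requested of the lifted coordinates:
  \frac{d}{dt}\,\varphi\bigl(\tilde{x}(t)\bigr) &= L\,\varphi\bigl(\tilde{x}(t)\bigr), \\
  % Chain rule turns this into a pointwise (simulation-free) condition:
  J_{\varphi}(\tilde{x})\,\tilde{v}(\tilde{x}) &= L\,\varphi(\tilde{x}), \\
  % Integrating the linear latent ODE from t = 0 to t = 1:
  \varphi\bigl(\tilde{x}(1)\bigr) &= e^{L}\,\varphi\bigl(\tilde{x}(0)\bigr),
  \qquad
  x(1) \approx \psi\bigl(e^{L}\,\varphi(x(0), 0)\bigr).
\end{align*}
```

The third line is one plausible reading of "infinitesimal consistency": it can be evaluated at any point $(x, t)$ using only the teacher's vector field and the encoder's Jacobian $J_{\varphi}$, without solving the ODE.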
Abstract
Continuous Normalizing Flows (CNFs) enable elegant generative modeling but remain bottlenecked by slow sampling: producing a single sample requires solving a nonlinear ODE with hundreds of function evaluations. Recent approaches such as Rectified Flow and OT-CFM accelerate sampling by straightening trajectories, yet the learned dynamics remain nonlinear black boxes, limiting both efficiency and interpretability. We propose a fundamentally different perspective: globally linearizing flow dynamics via Koopman theory. By lifting Conditional Flow Matching (CFM) into a higher-dimensional Koopman space, we represent its evolution with a single linear operator. This yields two key benefits. First, sampling becomes one-step and parallelizable, computed in closed form via the matrix exponential. Second, the Koopman operator provides a spectral blueprint of generation, enabling novel interpretability through its eigenvalues and modes. We derive a practical, simulation-free training objective that enforces infinitesimal consistency with the teacher's dynamics and show that this alignment preserves fidelity along the full generative path, distinguishing our method from boundary-only distillation. Empirically, our approach achieves competitive sample quality with dramatic speedups, while uniquely enabling spectral analysis of generative flows.
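The claim that sampling becomes a single matrix exponential can be illustrated on a toy system. The following is a minimal sketch, not the paper's model: it uses a well-known two-dimensional nonlinear system that admits an exact three-dimensional Koopman embedding, so the encoder is written by hand rather than learned, and the system is already autonomous so no time augmentation is needed. The parameters `MU` and `LAM` and all function names are hypothetical choices made for this example.

```python
import numpy as np

MU, LAM = -0.05, -1.0  # hypothetical toy parameters, not from the paper


def vector_field(x):
    """Nonlinear toy dynamics standing in for a teacher vector field."""
    x1, x2 = x[..., 0], x[..., 1]
    return np.stack([MU * x1, LAM * (x2 - x1**2)], axis=-1)


def encode(x):
    """Hand-built lift z = (x1, x2, x1^2); exact for this toy system."""
    x1, x2 = x[..., 0], x[..., 1]
    return np.stack([x1, x2, x1**2], axis=-1)


def decode(z):
    """The original state is the first two lifted coordinates."""
    return z[..., :2]


# Linear generator in the lifted space: dz/dt = L z.
L = np.array([
    [MU, 0.0, 0.0],
    [0.0, LAM, -LAM],
    [0.0, 0.0, 2.0 * MU],
])


def expm_eig(A):
    """Matrix exponential via eigendecomposition (A assumed diagonalizable)."""
    eigvals, eigvecs = np.linalg.eig(A)
    return (eigvecs @ np.diag(np.exp(eigvals)) @ np.linalg.inv(eigvecs)).real


def sample_one_step(x0):
    """Map x(0) to x(1) with one linear map in the lifted space."""
    return decode(encode(x0) @ expm_eig(L).T)


def sample_rk4(x0, n_steps=200):
    """Reference solution: integrate the nonlinear ODE from t=0 to t=1."""
    x, h = x0.copy(), 1.0 / n_steps
    for _ in range(n_steps):
        k1 = vector_field(x)
        k2 = vector_field(x + 0.5 * h * k1)
        k3 = vector_field(x + 0.5 * h * k2)
        k4 = vector_field(x + h * k3)
        x = x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return x


rng = np.random.default_rng(0)
x0 = rng.standard_normal((1000, 2))   # batch of "noise" samples
x1_koopman = sample_one_step(x0)      # one matrix multiply for the whole batch
x1_ode = sample_rk4(x0)               # 200 steps x 4 evaluations each
max_err = np.max(np.abs(x1_koopman - x1_ode))
koopman_eigenvalues = np.sort(np.linalg.eigvals(L).real)
```

In this toy case the eigenvalues of `L` are the decay rates of the individual lifted coordinates, which is the kind of spectral information the abstract refers to. For a learned, high-dimensional lift the embedding would only be approximate, and a dedicated routine such as `scipy.linalg.expm` would usually be preferable to the eigendecomposition used here.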
