STORK: Faster Diffusion And Flow Matching Sampling By Resolving Both Stiffness And Structure-Dependence

Zheng Tan; Weizhen Wang; Andrea L. Bertozzi; Ernest K. Ryu

TL;DR

This work targets slow sampling in diffusion and flow-matching models by addressing two issues in a single training-free solver: the stiffness of the sampling ODE and dependence on the semi-linear structure of the diffusion ODE. The authors introduce Stabilized Taylor Orthogonal Runge-Kutta (STORK), a structure-independent method built on stabilized Runge-Kutta (SRK) schemes for stiff problems. It uses Taylor expansion to form "virtual" function evaluations, aiming for high-order accuracy at a low number of function evaluations (NFEs) on both noise-based and flow-based ODEs. In image and video generation experiments, STORK is reported to outperform state-of-the-art training-free samplers (e.g., DPM-Solver++, UniPC) on unconditional and conditional tasks at low NFEs, including challenging video generation scenarios. Because it requires no additional training or model modification, the method can be applied directly to large diffusion and flow-matching models, which makes it relevant to real-time and resource-constrained generation settings.

Abstract

Diffusion models (DMs) and flow-matching models have demonstrated remarkable performance in image and video generation. However, such models require a large number of function evaluations (NFEs) during sampling, leading to costly inference. Consequently, quality-preserving fast sampling methods that require fewer NFEs have been an active area of research. Prior training-free sampling methods, however, fail to simultaneously address two key challenges: the stiffness of the ODE (i.e., the non-straightness of the velocity field) and dependence on the semi-linear structure of the DM ODE (which limits their direct applicability to flow-matching models). In this work, we introduce the Stabilized Taylor Orthogonal Runge-Kutta (STORK) method, which addresses both design concerns. We demonstrate that STORK consistently improves the quality of diffusion and flow-matching sampling for image and video generation. Code is available at https://github.com/ZT220501/STORK.
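The stiffness issue named in the abstract can be illustrated on a toy problem. The sketch below is not the STORK algorithm. It compares explicit Euler with a generic undamped first-order Chebyshev step (a textbook stabilized Runge-Kutta scheme) on the linear test ODE y' = λy, using the same total number of function evaluations for both. The function names, the value λ = -50, the step size, and the stage count are illustrative assumptions and do not come from the paper.

```python
def euler_step(f, y, h):
    """One explicit Euler step: stable on y' = lam*y only if |1 + h*lam| <= 1."""
    return y + h * f(y)


def chebyshev_step(f, y, h, s):
    """One step of the undamped first-order Chebyshev method with s stages.

    On y' = lam*y its stability polynomial is T_s(1 + h*lam / s**2), where T_s
    is the Chebyshev polynomial of degree s, so the step is stable for
    h*lam in [-2*s**2, 0]. With s = 1 it reduces to explicit Euler.
    """
    prev = y
    curr = y + (h / s**2) * f(y)  # first stage
    for _ in range(2, s + 1):
        # Three-term recurrence mirroring T_j = 2*x*T_{j-1} - T_{j-2}.
        prev, curr = curr, 2 * curr - prev + (2 * h / s**2) * f(curr)
    return curr


def integrate(step, y0, n_steps):
    """Apply a one-argument step function n_steps times."""
    y = y0
    for _ in range(n_steps):
        y = step(y)
    return y


# Stiff toy problem (assumed values): the exact solution decays like exp(-50 t).
lam = -50.0
f = lambda y: lam * y
h, s, n_steps = 1.0, 6, 10  # 10 large steps, 6 function evaluations each

# Chebyshev: h*lam = -50 lies inside [-2*s**2, 0] = [-72, 0], so it stays bounded.
y_cheb = integrate(lambda y: chebyshev_step(f, y, h, s), 1.0, n_steps)

# Euler with the same budget of 60 evaluations: substep h/s gives |1 + (h/s)*lam| > 1.
y_euler = integrate(lambda y: euler_step(f, y, h / s), 1.0, n_steps * s)

print("Chebyshev:", y_cheb)
print("Euler:    ", y_euler)
```

With an equal evaluation budget, the Chebyshev iterate stays bounded while the Euler iterate grows without bound. This is the general motivation for stabilized explicit schemes when few evaluations are available.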
