A-FloPS: Accelerating Diffusion Models via Adaptive Flow Path Sampler

Cheng Jin, Zhenyu Xiao, Yuantao Gu

TL;DR

A-FloPS tackles the latency of diffusion-model sampling with a training-free, trajectory-level reparameterization that maps the sampling trajectory of any pre-trained diffusion model onto a flow-matching (FM) path. The key advance is an adaptive velocity decomposition that splits the FM velocity into a linear drift and a smooth residual, so that high-order ODE integration stays accurate even with very few steps. Combining the diffusion-to-flow transformation (FloPS) with this adaptive decomposition (A-FloPS) yields state-of-the-art results among training-free samplers on conditional image generation and text-to-image synthesis, with notable gains at $\text{NFE}=5$. The adaptive mechanism also applies to native FM generators, which makes the approach a practical option for low-latency, high-quality generative modeling.

Abstract

Diffusion models deliver state-of-the-art generative performance across diverse modalities but remain computationally expensive due to their inherently iterative sampling process. Existing training-free acceleration methods typically improve numerical solvers for the reverse-time ODE, yet their effectiveness is fundamentally constrained by the inefficiency of the underlying sampling trajectories. We propose A-FloPS (Adaptive Flow Path Sampler), a principled, training-free framework that reparameterizes the sampling trajectory of any pre-trained diffusion model into a flow-matching form and augments it with an adaptive velocity decomposition. The reparameterization analytically maps diffusion scores to flow-compatible velocities, yielding integration-friendly trajectories without retraining. The adaptive mechanism further factorizes the velocity field into a linear drift term and a residual component whose temporal variation is actively suppressed, restoring the accuracy benefits of high-order integration even in extremely low-NFE regimes. Extensive experiments on conditional image generation and text-to-image synthesis show that A-FloPS consistently outperforms state-of-the-art training-free samplers in both sample quality and efficiency. Notably, with as few as $5$ function evaluations, A-FloPS achieves substantially lower FID and generates sharper, more coherent images. The adaptive mechanism also improves native flow-based generative models, underscoring its generality. These results position A-FloPS as a versatile and effective solution for high-quality, low-latency generative modeling.
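
To illustrate why separating a linear drift from a slowly varying residual can help at very low step counts, here is a minimal toy sketch. It is not the authors' algorithm: the scalar velocity field, the drift coefficient `a`, the constant residual `c`, and the function names are all hypothetical, and the update shown is a generic exponential-integrator step that treats the linear part exactly and holds the residual constant over each step.

```python
import math

A = -3.0  # hypothetical linear drift coefficient (assumed known here)
C = 2.0   # hypothetical residual; constant in this toy problem


def velocity(x, t):
    """Toy velocity field: linear drift A*x plus a residual C."""
    return A * x + C


def euler_step(x, t, h):
    """Plain first-order Euler update on the full velocity."""
    return x + h * velocity(x, t)


def split_step(x, t, h):
    """Integrate the linear drift exactly; hold the residual fixed over the step."""
    residual = velocity(x, t) - A * x      # what remains after removing A*x
    growth = math.exp(A * h)               # exact propagator of dx/dt = A*x
    return growth * x + (growth - 1.0) / A * residual


def integrate(step, x0=1.0, n_steps=5):
    """Integrate from t=0 to t=1 with a fixed number of steps."""
    h = 1.0 / n_steps
    x = x0
    for i in range(n_steps):
        x = step(x, i * h, h)
    return x


# Closed-form solution of dx/dt = A*x + C at t=1 with x(0)=1.
decay = math.exp(A)
exact = decay * 1.0 + (decay - 1.0) / A * C

err_euler = abs(integrate(euler_step) - exact)
err_split = abs(integrate(split_step) - exact)
print(f"Euler error: {err_euler:.2e}, drift/residual error: {err_split:.2e}")
```

In this toy problem the residual is exactly constant, so the split update matches the closed-form solution up to rounding while five Euler steps leave a visible error. With a real network the residual varies over time, and how well such a split works depends on how strongly that variation is suppressed.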

Paper Structure

This paper contains 29 sections, 1 theorem, 30 equations, 5 figures, 3 tables, 2 algorithms.

Key Result

Theorem 1

If a diffusion model and a flow-matching model target the same distribution, i.e., $q_1$ in Eq. eq:fm-training equals $p_0$ in Eq. eq:forwardDM_special, their velocity field and score function satisfy: where $t = \frac{1}{1 + \sigma_\tau / \bar{\alpha}_\tau}$. This transformation is bijective since $\sigma_\tau / \bar{\alpha}_\tau$ increases strictly with $\tau$, ensuring a one-to-one correspondence …
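
The time mapping in Theorem 1 can be sketched numerically as follows. Only the formula for $t$ comes from the statement above. The score-to-velocity relation in `score_to_velocity` is an assumption: it is derived here from a forward process $x_\tau = \bar{\alpha}_\tau x_0 + \sigma_\tau \epsilon$ and a linear flow path $x_t = t\,x_1 + (1-t)\,\epsilon$ with $x_t = x_\tau / (\bar{\alpha}_\tau + \sigma_\tau)$, and it may differ in form from the paper's expression. The function names and the 1-D Gaussian test case are hypothetical.

```python
def dm_to_fm_time(alpha_bar, sigma):
    """Map diffusion noise levels to flow time: t = 1 / (1 + sigma / alpha_bar)."""
    return 1.0 / (1.0 + sigma / alpha_bar)


def score_to_velocity(x_tau, score, alpha_bar, sigma):
    """Assumed relation v = E[x_1 - eps | x_t] for a linear path (see lead-in).

    Uses E[eps | x_tau] = -sigma * score and
    E[x_0 | x_tau] = (x_tau + sigma**2 * score) / alpha_bar.
    """
    return x_tau / alpha_bar + sigma * (sigma + alpha_bar) / alpha_bar * score


# Sanity check on 1-D Gaussian data x_0 ~ N(0, s2), where both the
# diffusion score and the flow velocity have closed forms.
s2 = 4.0                      # data variance (hypothetical)
alpha_bar, sigma = 0.6, 0.8   # one noise level of a hypothetical schedule
x_t = 0.5                     # a point on the flow path

t = dm_to_fm_time(alpha_bar, sigma)
x_tau = (alpha_bar + sigma) * x_t                      # rescale to diffusion state
score = -x_tau / (alpha_bar**2 * s2 + sigma**2)        # exact Gaussian score
v_from_score = score_to_velocity(x_tau, score, alpha_bar, sigma)

# Closed-form velocity of the linear path for the same Gaussian data.
v_closed_form = (t * s2 - (1.0 - t)) / (t**2 * s2 + (1.0 - t) ** 2) * x_t

# The mapping is strictly monotone: larger sigma / alpha_bar gives smaller t.
ratios = [0.0, 0.1, 1.0, 10.0]
times = [dm_to_fm_time(1.0, r) for r in ratios]

print(t, v_from_score, v_closed_form, times)
```

With these conventions, $\tau = 0$ (no noise, $\sigma_\tau = 0$) maps to $t = 1$, the data end of the flow path, and $t$ decreases toward $0$ as $\sigma_\tau / \bar{\alpha}_\tau$ grows.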

Figures (5)

  • Figure 1: Qualitative comparison of generated images for the "golden retriever" class using different diffusion samplers across varying inference steps. Each row corresponds to a specific sampler (from top to bottom: DDIM, DPM-Solver++, UniPC, FloPS, and A-FloPS), while each column shows results at a different number of function evaluations (NFE = 5, 6, 7, 8, 9, 25). All methods are initialized with the same random seed to ensure a fair comparison. Our proposed A-FloPS generates sharper textures and more coherent object structures, achieving superior visual fidelity with substantially fewer sampling steps.
  • Figure 2: Qualitative comparison between A-Euler and baseline Euler on SDv3.5 (d=38) at $\text{NFE}=5$. Prompt: "A cat is sitting on top of a bed in front of a living room". A-Euler produces sharper textures and more coherent structures.
  • Figure 3: Qualitative comparison of generated samples from different samplers on the DIT model with $\text{NFE}=5$. Our proposed A-FloPS (d) produces images with clearer structures and richer details than DDIM (a), DPM-Solver++ (b), and UniPC (c).
  • Figure 4: Prompt: "A man in a suit smiling at the camera".
  • Figure 7: Prompt: "a large bowl full of pasta with many other foods".

Theorems & Definitions (1)

  • Theorem 1: Diffusion‑to‑Flow Transformation