Hyperparameters are all you need: Using five-step inference for an original diffusion model to generate images comparable to the latest distillation model

Zilai Li

TL;DR

The paper tackles the heavy computational burden of diffusion-model sampling by introducing a training-free inference plugin that utilizes truncation-error analysis and hyperparameter-tuned ODE solvers, augmented with a Free-U decorator to modify the U-Net's skip connections. By operating in a latent-diffusion framework, the method enables 5–6 step generation for high-resolution images without retraining, and reports competitive or superior FID against state-of-the-art distillation models on COCO/LAION datasets. An information-theoretic analysis and ablation studies elucidate why certain hyperparameter couplings and final-stage denoising preserve image diversity while accelerating inference. The authors provide extensive experiments across 512×512 and 1024×1024 outputs, demonstrating robust performance and offering a public codebase for reproducibility.

Abstract

The diffusion model is a state-of-the-art generative model that samples images by applying a neural network iteratively. However, the original sampling algorithm incurs a substantial computational cost, and reducing the number of sampling steps is an active research area. One mainstream approach to this problem is to treat the sampling process as the numerical solution of an ordinary differential equation (ODE). Our study proposes a training-free inference plugin compatible with most few-step ODE solvers. To the best of our knowledge, our algorithm is the first training-free algorithm to sample a 1024 x 1024-resolution image in 6 steps and a 512 x 512-resolution image in 5 steps, with an FID result that outperforms the SOTA distillation models and the 20-step DPM++ 2m solver, respectively. Based on analyses of the latent diffusion model's structure, the diffusion ODE, and the Free-U mechanism, we explain why specific hyperparameter couplings improve stability and inference speed without retraining. The experimental results also reveal a new design space for latent diffusion ODE solvers. In addition, we analyze the difference between the original diffusion model and the diffusion distillation model through an information-theoretic study, which shows why a few-step ODE solver designed for the diffusion model can outperform training-based diffusion distillation algorithms in few-step inference. Preliminary experimental results support the mathematical analysis. The code base is available at: https://github.com/TheLovesOfLadyPurple/Hyperparameter-is-all-you-need
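
To make the "sampling as ODE solving" view in the abstract concrete, the following is a minimal sketch of few-step sampling with a plain Euler integrator. It is not the solver or plugin proposed in the paper. Everything in it is an assumption made for illustration: the function names (`karras_sigmas`, `toy_denoiser`, `euler_sample`) are hypothetical, the noise schedule is a common power-law schedule with assumed defaults, and the neural network is replaced by the closed-form ideal denoiser for one-dimensional Gaussian data with standard deviation 0.5, so the block runs without any trained model.

```python
import numpy as np


def karras_sigmas(n_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Assumed power-law noise schedule: n_steps levels from sigma_max down
    to sigma_min, followed by a final 0 (so n_steps solver steps in total)."""
    ramp = np.linspace(0.0, 1.0, n_steps)
    max_inv_rho = sigma_max ** (1.0 / rho)
    min_inv_rho = sigma_min ** (1.0 / rho)
    sigmas = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho
    return np.append(sigmas, 0.0)


def toy_denoiser(x, sigma, data_std=0.5):
    """Stand-in for the network: the exact posterior mean E[x0 | x] when the
    data are N(0, data_std^2) and x = x0 + sigma * noise."""
    return x * data_std**2 / (data_std**2 + sigma**2)


def euler_sample(x, sigmas, denoiser):
    """Integrate dx/dsigma = (x - D(x, sigma)) / sigma with Euler steps,
    moving from the largest noise level down to zero."""
    for sigma_cur, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        slope = (x - denoiser(x, sigma_cur)) / sigma_cur
        x = x + (sigma_next - sigma_cur) * slope
    return x


# Start from pure noise at the largest noise level and compare step budgets.
rng = np.random.default_rng(0)
x_init = 80.0 * rng.standard_normal(10_000)
for n_steps in (5, 100):
    samples = euler_sample(x_init, karras_sigmas(n_steps), toy_denoiser)
    print(n_steps, "steps -> sample std:", samples.std())
```

Ideal samples in this toy setting would have a standard deviation close to 0.5. Comparing the 5-step and 100-step outputs shows how much truncation error a naive first-order solver accumulates at a small step budget, which is the kind of error that few-step solvers and their hyperparameters aim to control.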
