FRDiff : Feature Reuse for Universal Training-free Acceleration of Diffusion Models
Junhyuk So, Jungwon Lee, Eunhyeok Park
TL;DR
FRDiff introduces Feature Reuse (FR) to exploit temporal redundancy in diffusion models, enabling training-free acceleration by saving intermediate feature maps at selected keyframes and reusing them in the denoising steps that follow. It couples FR with Score Mixing, which blends the low-frequency structure obtained from reduced NFE (number of score function evaluations) with the high-frequency detail preserved by FR, and adds Auto-FR to tune the reuse policy automatically without model fine-tuning. In experiments across SD, SDXL, LDM, and DiT, FRDiff achieves speedups of up to 1.76x while maintaining or improving image quality as measured by FID, and it applies to other tasks such as super-resolution, inpainting, and image-to-video. The approach offers a practical, plug-and-play route to faster diffusion-based generation without the cost of retraining or heavy engineering.
Abstract
The substantial computational cost of diffusion models, especially due to the repeated denoising steps necessary for high-quality image generation, presents a major obstacle to their widespread adoption. While several studies have attempted to address this issue by reducing the number of score function evaluations (NFE) using advanced ODE solvers without fine-tuning, the decreased number of denoising iterations misses the opportunity to update fine details, resulting in noticeable quality degradation. In our work, we introduce an advanced acceleration technique that leverages the temporal redundancy inherent in diffusion models. Reusing feature maps with high temporal similarity opens up a new opportunity to save computational resources without compromising output quality. To realize the practical benefits of this intuition, we conduct an extensive analysis and propose a novel method, FRDiff. FRDiff is designed to harness the advantages of both reduced NFE and feature reuse, achieving a Pareto frontier that balances the trade-off between fidelity and latency in various generative tasks.
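The feature-reuse idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the class `CachedResidualBlock`, the function `run_denoising`, the `keyframe_interval` parameter, and the toy update rule are all hypothetical names and simplifications chosen for illustration. The sketch assumes a residual block of the form y = x + f(x), where the expensive branch f(x) is recomputed only at keyframe steps and its cached output is reused at the steps in between.

```python
import math


class CachedResidualBlock:
    """Toy residual block y = x + f(x) that can reuse a cached f(x).

    Hypothetical stand-in for a network block; not the FRDiff code.
    """

    def __init__(self, scale):
        self.scale = scale
        self.cache = None        # last computed output of the residual branch
        self.compute_calls = 0   # how many times the expensive branch ran

    def _branch(self, x):
        # Stand-in for an expensive sub-network (assumption for illustration).
        self.compute_calls += 1
        return [self.scale * math.tanh(v) for v in x]

    def forward(self, x, is_keyframe):
        # Recompute the branch only at keyframes (or when nothing is cached).
        if is_keyframe or self.cache is None:
            self.cache = self._branch(x)
        # The skip path always uses the current input; only f(x) is reused.
        return [a + b for a, b in zip(x, self.cache)]


def run_denoising(x, blocks, num_steps, keyframe_interval):
    """Run a toy iterative loop, refreshing cached features every
    `keyframe_interval` steps and reusing them in between."""
    for step in range(num_steps):
        is_keyframe = step % keyframe_interval == 0
        h = x
        for block in blocks:
            h = block.forward(h, is_keyframe)
        # Toy update rule (not a real diffusion solver step).
        x = [0.9 * a + 0.1 * b for a, b in zip(x, h)]
    return x


blocks = [CachedResidualBlock(0.5), CachedResidualBlock(0.25)]
out = run_denoising([1.0, -2.0, 0.5], blocks, num_steps=12, keyframe_interval=3)
calls = [b.compute_calls for b in blocks]
```

With 12 steps and a keyframe every 3 steps, each block's expensive branch runs 4 times instead of 12, which is where the saving in this sketch comes from. A larger `keyframe_interval` saves more computation but reuses staler features, so the interval acts as the knob for the fidelity-latency trade-off.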
