Relational Diffusion Distillation for Efficient Image Generation

Weilun Feng, Chuanguang Yang, Zhulin An, Libo Huang, Boyu Diao, Fei Wang, Yongjun Xu

TL;DR

Relational Diffusion Distillation (RDD) targets slow diffusion-model inference with a diffusion-specific distillation method that transfers cross-sample relational knowledge from teacher to student. It adds two relational losses to the standard CFD objective: IS_P2P (intra-sample pixel-to-pixel relationships) and M_P2P (memory-based pixel-to-pixel relationships). An online pixel queue diversifies the interactions across samples. Empirically, RDD improves on prior distillation methods (e.g., RCFD and PD) on CIFAR-10 and ImageNet 64×64, achieving up to 256× speed-ups relative to DDIM while maintaining or improving image quality at very low sampling steps. This makes high-quality diffusion-based image generation more practical on edge devices and in other resource-constrained settings.
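The two relational terms and the pixel queue can be illustrated with a small sketch. This is not the authors' implementation: the summary above does not give the loss formulas, so the cosine similarity, the temperature `tau`, the KL divergence, the choice to store teacher pixels in the queue, and the helper names `relation` and `relation_kl` are all assumptions made for illustration. Pixel features are plain Python lists so the example runs without any dependencies.

```python
import math
from collections import deque


def normalize(v):
    """Scale a feature vector to unit L2 norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def softmax(row, tau):
    """Temperature-scaled softmax over one row of similarities."""
    peak = max(row)
    exps = [math.exp((x - peak) / tau) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def relation(anchors, keys, tau=0.1):
    """Hypothetical helper: for each anchor pixel, a distribution over key pixels
    obtained from cosine similarities (an assumed form of 'pixel-to-pixel relation')."""
    a = [normalize(v) for v in anchors]
    k = [normalize(v) for v in keys]
    return [softmax([sum(x * y for x, y in zip(ai, kj)) for kj in k], tau) for ai in a]


def relation_kl(teacher_rel, student_rel, eps=1e-12):
    """Hypothetical helper: mean KL(teacher || student) over anchor rows."""
    total = 0.0
    for t_row, s_row in zip(teacher_rel, student_rel):
        total += sum(t * math.log((t + eps) / (s + eps)) for t, s in zip(t_row, s_row))
    return total / len(teacher_rel)


# Toy pixel features (3 pixels, 2 channels) standing in for teacher/student outputs.
teacher = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
student = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]

# Assumed online pixel queue: a fixed-size FIFO of teacher pixels from earlier batches.
queue = deque(maxlen=4)
queue.extend([[0.5, 0.5], [1.0, -1.0]])

# Relations among pixels of the current batch (stand-in for an IS_P2P-style term).
intra_loss = relation_kl(relation(teacher, teacher), relation(student, student))

# Relations between current pixels and queued pixels (stand-in for an M_P2P-style term).
keys = list(queue)
memory_loss = relation_kl(relation(teacher, keys), relation(student, keys))

# Enqueue the current teacher pixels; the oldest entries are dropped automatically.
queue.extend(teacher)

print(intra_loss, memory_loss, len(queue))
```

Under these assumptions, both terms vanish when the student reproduces the teacher's pixel relations exactly, and the bounded queue lets each batch interact with pixels from earlier samples without holding all of them in memory at once.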

Abstract

Although diffusion models have achieved remarkable performance in image generation, their high inference latency hinders wide deployment on edge devices with scarce computing resources. Many training-free sampling methods have therefore been proposed to reduce the number of sampling steps that diffusion models require, but they perform poorly when the number of steps is very small. With the emergence of knowledge distillation, existing training-based methods have achieved excellent results at very low step counts. However, current methods mainly focus on designing novel diffusion sampling methods with knowledge distillation. How to transfer better diffusion knowledge from the teacher model is a more valuable but rarely studied problem. We therefore propose Relational Diffusion Distillation (RDD), a novel distillation method tailored specifically for diffusion models. Unlike existing methods that simply align the teacher and student models at the pixel level or in feature distributions, our method introduces cross-sample relationship interaction during distillation and alleviates the memory constraints induced by multi-sample interaction. RDD significantly enhances the effectiveness of the progressive distillation framework for diffusion models. Extensive experiments on several datasets (e.g., CIFAR-10 and ImageNet) demonstrate that RDD yields a 1.47 FID decrease at 1 sampling step compared with state-of-the-art diffusion distillation methods and achieves a 256x speed-up compared with the DDIM strategy. Code is available at https://github.com/cantbebetter2/RDD.

Paper Structure

This paper contains 17 sections, 10 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: Different distillation targets between (a) PD, (b) RCFD, and (c) our proposed RDD.
  • Figure 2: Difference between Intra-image and Intra-sample pixel-to-pixel distillation.
  • Figure 3: Overview of Intra-Sample Pixel-to-Pixel Relationship Distillation.
  • Figure 4: Overview of Memory-based Pixel-to-Pixel Relationship Distillation.
  • Figure 5: Samples generated in one step by (a) PD, (b) RCFD, and (c) our proposed RDD on ImageNet 64$\times$64. All corresponding images are generated from the same initial noise.
  • ...and 4 more figures