Diversity-Driven Generative Dataset Distillation Based on Diffusion Model with Self-Adaptive Memory

Mingzhuo Li, Guang Li, Jiafeng Mao, Takahiro Ogawa, Miki Haseyama

TL;DR

This work tackles insufficient diversity in generative dataset distillation by combining a diffusion model with a self-adaptive memory mechanism. The approach uses two memory banks and minimax diffusion objectives to keep the distilled data representative of the real data while encouraging diverse coverage of the distribution. Key contributions include a latent-diffusion distillation pipeline, two memory-driven loss terms, and a self-adaptive update rule that preserves memory diversity during training. Experiments on ImageWoof, ImageNette, and ImageIDC show downstream accuracy gains over state-of-the-art methods, especially at low IPC (images per class).

Abstract

Dataset distillation compresses a large dataset into a small, representative one, so that deep neural networks can be trained to comparable performance in significantly less time. Although generative models have achieved strong results in this field, the distributions of their distilled datasets are not diverse enough to represent the original ones, which lowers downstream validation accuracy. In this paper, we present a diversity-driven generative dataset distillation method based on a diffusion model to address this problem. We introduce a self-adaptive memory to align the distributions of the distilled and real datasets, which serves as a measure of representativeness. The degree of alignment guides the diffusion model to generate more diverse datasets during distillation. Extensive experiments show that our method outperforms existing state-of-the-art methods in most situations, demonstrating its effectiveness for dataset distillation.

Paper Structure

This paper contains 11 sections, 8 equations, 4 figures, 3 tables, 1 algorithm.

Figures (4)

  • Figure 1: An illustration of the gradient field during distillation. Blue areas represent the original distribution. Arrows follow the direction of gradient descent. Black lines trace the distillation process, with black dots marking the initial random noise and red dots marking the distilled images. The proposed method covers more of the distribution, indicating better diversity.
  • Figure 2: The distillation process of our method. Randomly selected images are fed to the diffusion model to obtain generated latents and compute the diffusion loss $\mathcal{L}_\text{diffusion}$. Two memory sets, holding real and generated latents respectively, are used to compute the diversity losses $\mathcal{L}_\text{real}$ and $\mathcal{L}_\text{gen}$. After each epoch, the memory is updated self-adaptively based on the similarity vector.
  • Figure 3: Visual comparison of original images with images distilled by DiT, Minimax, and our method at IPC = 50.
  • Figure 4: Analysis of the hyperparameters: the real-loss weight $\lambda_r$, the generative-loss weight $\lambda_g$, and the memory sizes $N_R$ and $N_G$. Results are obtained with ResNetAP-10 on ImageWoof under different IPC settings. Each setting is run 3 times; points show the mean accuracy, and shaded regions span the range between minimum and maximum accuracy. Dashed lines mark the performance of the settings adopted in the primary experiments.
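
The Figure 2 caption describes two memory sets that feed $\mathcal{L}_\text{real}$ and $\mathcal{L}_\text{gen}$, and the Figure 4 caption names their weights $\lambda_r$ and $\lambda_g$. The sketch below shows one way such terms could be computed. It is not the authors' implementation: the cosine similarity, the min/max (minimax-style) form and signs of the two losses, the weighted-sum combination, and the `adaptive_update` replacement rule are all assumptions made for illustration, and every function name is hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def similarity_vector(z, memory):
    """Similarity of one latent z to every entry of a memory set."""
    return [cosine(z, m) for m in memory]


def real_loss(z, real_memory):
    # Assumed form: raise the lowest similarity to the real latents,
    # so the generated latent stays close to the whole real memory.
    return -min(similarity_vector(z, real_memory))


def gen_loss(z, gen_memory):
    # Assumed form: lower the highest similarity to earlier generated
    # latents, so new samples do not collapse onto old ones.
    return max(similarity_vector(z, gen_memory))


def total_loss(l_diffusion, z, real_memory, gen_memory, lambda_r, lambda_g):
    # Assumed combination: a weighted sum of the three terms.
    return (l_diffusion
            + lambda_r * real_loss(z, real_memory)
            + lambda_g * gen_loss(z, gen_memory))


def adaptive_update(memory, candidate):
    """Hypothetical update rule (not taken from the paper): overwrite the
    entry most similar to the candidate, so the memory size stays fixed
    and no near-duplicate pair is stored."""
    sims = similarity_vector(candidate, memory)
    idx = max(range(len(sims)), key=sims.__getitem__)
    updated = list(memory)
    updated[idx] = candidate
    return updated


# Toy 2-D latents; all values are illustrative only.
real_mem = [[1.0, 0.0], [0.0, 1.0]]
gen_mem = [[1.0, 0.0], [0.0, 1.0]]
z = [1.0, 0.0]
loss = total_loss(0.5, z, real_mem, gen_mem, lambda_r=0.1, lambda_g=0.2)
new_mem = adaptive_update(gen_mem, [0.9, 0.1])
```

In a real pipeline the latents would be tensors produced by the diffusion model and the losses would need to be differentiable; plain lists are used here only to keep the example dependency-free.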