Efficient Dataset Distillation via Minimax Diffusion

Jianyang Gu, Saeed Vahidian, Vyacheslav Kungurtsev, Haonan Wang, Wei Jiang, Yang You, Yiran Chen

TL;DR

This work adds minimax criteria to the generative training of diffusion models to enhance the representativeness and diversity of the generated images. It also presents a theoretical model of the process as hierarchical diffusion control, demonstrating that the diffusion process can target these criteria without jeopardizing the faithfulness of the samples to the desired distribution.

Abstract

Dataset distillation reduces the storage and computation required to train a network by generating a small surrogate dataset that encapsulates rich information from the original large-scale one. However, previous distillation methods rely heavily on a sample-wise iterative optimization scheme. As the images-per-class (IPC) setting or image resolution grows, the necessary computation demands overwhelming time and resources. In this work, we incorporate generative diffusion techniques for computing the surrogate dataset. Observing that the key factors for constructing an effective surrogate dataset are representativeness and diversity, we design additional minimax criteria in the generative training to enhance these facets for the images generated by diffusion models. We present a theoretical model of the process as hierarchical diffusion control, demonstrating the flexibility of the diffusion process to target these criteria without jeopardizing the faithfulness of the samples to the desired distribution. The proposed method achieves state-of-the-art validation performance while demanding far fewer computational resources. Under the 100-IPC setting on ImageWoof, our method requires less than one-twentieth of the distillation time of previous methods, yet yields even better performance. Source code and generated data are available at https://github.com/vimar-gu/MinimaxDiffusion.
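
The abstract names the two minimax criteria but gives no formulas, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that representativeness means maximizing the minimum similarity between a predicted embedding and a memory of real-image embeddings, and that diversity means minimizing the maximum similarity to a memory of already generated embeddings. Cosine similarity, the weights `w_rep` and `w_div`, and all function names are hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def representativeness_loss(pred, real_memory):
    # Assumed reading: maximize the minimum similarity to the real
    # embeddings, written here as minimizing its negative.
    return -min(cosine(pred, m) for m in real_memory)


def diversity_loss(pred, generated_memory):
    # Assumed reading: minimize the maximum similarity to embeddings
    # that were generated earlier, pushing new samples apart.
    return max(cosine(pred, g) for g in generated_memory)


def minimax_penalty(pred, real_memory, generated_memory, w_rep=1.0, w_div=1.0):
    # Hypothetical weighted sum that would be added to the usual
    # diffusion training loss; the weights are illustrative only.
    return (w_rep * representativeness_loss(pred, real_memory)
            + w_div * diversity_loss(pred, generated_memory))


if __name__ == "__main__":
    pred = [1.0, 0.0]
    real = [[1.0, 0.0], [0.0, 1.0]]
    generated = [[1.0, 0.0], [-1.0, 0.0]]
    print(minimax_penalty(pred, real, generated))
```

In this toy call, the least similar real embedding is orthogonal to the prediction and one generated embedding coincides with it, so the penalty is dominated by the diversity term. Taking the worst case over each memory, rather than the mean, is what makes the criteria minimax in this reading.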

Paper Structure (33 sections, 11 equations, 20 figures, 10 tables, 1 algorithm)

Figures (20)

  • Figure 1: The validation accuracy and distillation time of different methods on ImageWoof [imagenette], with the number following each method denoting the Images-Per-Class (IPC) setting. Previous methods are restricted by heavier running time and memory consumption as IPC grows. In comparison, our proposed method notably reduces the computational resources required and also achieves state-of-the-art validation performance.
  • Figure 2: Sample images distilled by the pixel-level sample-wise optimization method DM [zhao2023dataset] on ImageWoof. With the same initialization, the appearance disparity between original and distilled images shrinks as the parameter space grows with the Images-Per-Class (IPC) setting.
  • Figure 3: Comparison of the feature distributions of different image generation methods against the original set. The validation performance of each surrogate set is listed in the upper-right corner.
  • Figure 4: With the help of minimax diffusion, the proposed method significantly enhances the representativeness and diversity of the generated images. As a result, it consistently outperforms random selection and baseline diffusion models by a large margin across different IPC settings.
  • Figure 5: Visualization of random original images, images generated by baseline diffusion models (DiT [peebles2023scalable] and DiffFit [xie2023difffit]), and images generated by our proposed method. Within each column, the generated images are based on the same random seed. Comparatively, our method significantly enhances the coverage of the original data distribution and the diversity of the surrogate dataset.
  • ...and 15 more figures