Generative Dataset Distillation Based on Self-knowledge Distillation

Longzhen Li; Guang Li; Ren Togo; Keisuke Maeda; Takahiro Ogawa; Miki Haseyama

TL;DR

This work tackles the high data and computational costs of training large models by introducing a generative dataset distillation framework that leverages self-knowledge distillation and distribution matching. A logits standardization step is applied prior to distribution matching to ensure consistent logit ranges, improving the fidelity of the synthetic data. Across MNIST, Fashion-MNIST, and CIFAR-10, the approach outperforms state-of-the-art distillation methods and demonstrates strong cross-architecture generalization. The findings suggest that combining GAN-based generation with robust logit-based distribution alignment yields more efficient and portable distillation for diverse architectures.

Abstract

Dataset distillation is an effective technique for reducing the cost and complexity of model training while maintaining performance by compressing large datasets into smaller, more efficient versions. In this paper, we present a novel generative dataset distillation method that can improve the accuracy of aligning prediction logits. Our approach integrates self-knowledge distillation to achieve more precise distribution matching between the synthetic and original data, thereby capturing the overall structure and relationships within the data. To further improve the accuracy of alignment, we introduce a standardization step on the logits before performing distribution matching, ensuring consistency in the range of logits. Extensive experiments demonstrate that our method outperforms existing state-of-the-art dataset distillation methods.
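To illustrate why standardizing logits makes their ranges consistent, here is a minimal sketch that assumes the standardization is a per-sample z-score (zero mean, unit standard deviation). The function name `standardize_logits` and the `eps` constant are illustrative assumptions, not taken from the paper.

```python
import math

def standardize_logits(logits, eps=1e-7):
    """Z-score one logit vector (assumed form of the standardization step)."""
    mean = sum(logits) / len(logits)
    var = sum((x - mean) ** 2 for x in logits) / len(logits)
    std = math.sqrt(var)
    # eps guards against division by zero when all logits are equal
    return [(x - mean) / (std + eps) for x in logits]

# Two logit vectors with the same class ranking but very different scales.
small = [0.1, 0.2, 0.7]
large = [10.0, 20.0, 70.0]

z_small = standardize_logits(small)
z_large = standardize_logits(large)

# After standardization the two vectors are (almost) identical,
# so a matching loss no longer depends on the raw logit magnitude.
print(z_small)
print(z_large)
```

Under this assumption, logits that differ only by an affine rescaling map to nearly the same standardized vector, which is the kind of range consistency the abstract describes.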

Paper Structure

This paper contains 9 sections, 7 equations, 2 figures, and 2 tables.

Introduction
Generative Dataset Distillation based on Self-knowledge Distillation
GAN Generator Training
Dataset Distillation via Self-knowledge Distillation
Experiments
Datasets and Comparative Methods
Benchmark Results
Cross-architecture Results
Conclusion

Figures (2)

Figure 1: The distillation process of the proposed method. It involves the generator $G$ creating a synthetic dataset $S$. Both the original dataset $O$ and the synthetic dataset $S$ are then fed into a randomly selected model. The logits are standardized and their distributions are matched.
Figure 2: Synthetic MNIST, Fashion-MNIST, and CIFAR-10 datasets generated with IPC = 10 (images per class).
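The matching step in Figure 1 can be sketched as follows. This section does not state the paper's loss, so the example substitutes a simple first-moment match as a stand-in: the squared distance between the batch-mean standardized logits of $O$ and $S$. The helper names (`standardize`, `batch_mean`, `logit_matching_loss`) and the toy logit values are hypothetical.

```python
import math

def standardize(row, eps=1e-7):
    """Per-sample z-score of a logit vector (assumed standardization)."""
    mean = sum(row) / len(row)
    std = math.sqrt(sum((x - mean) ** 2 for x in row) / len(row))
    return [(x - mean) / (std + eps) for x in row]

def batch_mean(rows):
    """Average a batch of logit vectors class by class."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def logit_matching_loss(original_logits, synthetic_logits):
    """Stand-in discrepancy: squared L2 distance between mean standardized logits."""
    mu_o = batch_mean([standardize(r) for r in original_logits])
    mu_s = batch_mean([standardize(r) for r in synthetic_logits])
    return sum((a - b) ** 2 for a, b in zip(mu_o, mu_s))

# Toy logits that a model might produce for batches from O and S.
original = [[2.0, 0.5, -1.0], [1.5, 0.0, -0.5]]
same_up_to_scale = [[20.0, 5.0, -10.0], [15.0, 0.0, -5.0]]
different = [[-1.0, 0.5, 2.0], [-0.5, 0.0, 1.5]]

loss_matched = logit_matching_loss(original, same_up_to_scale)
loss_mismatched = logit_matching_loss(original, different)
print(loss_matched, loss_mismatched)
```

In this toy setup, synthetic logits that differ from the original ones only in scale give a near-zero loss, while logits with a reversed class ordering give a clearly larger one. In a real pipeline the loss would be differentiated with respect to the generator's parameters, which this sketch omits.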
