Generative Dataset Distillation Based on Self-knowledge Distillation
Longzhen Li, Guang Li, Ren Togo, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama
TL;DR
This work tackles the high data and computational costs of training large models by introducing a generative dataset distillation framework that leverages self-knowledge distillation and distribution matching. A logits standardization step is applied prior to distribution matching to ensure consistent logit ranges, improving the fidelity of the synthetic data. Across MNIST, Fashion-MNIST, and CIFAR-10, the approach outperforms state-of-the-art distillation methods and demonstrates strong cross-architecture generalization. The findings suggest that combining GAN-based generation with robust logit-based distribution alignment yields more efficient and portable distillation for diverse architectures.
Abstract
Dataset distillation reduces the cost and complexity of model training by compressing a large dataset into a much smaller synthetic one while maintaining performance. In this paper, we present a novel generative dataset distillation method that aligns prediction logits more accurately. Our approach integrates self-knowledge distillation to achieve more precise distribution matching between the synthetic and original data, thereby capturing the overall structure and relationships within the data. To further improve alignment accuracy, we standardize the logits before performing distribution matching, which keeps their ranges consistent. Extensive experiments demonstrate that our method outperforms existing state-of-the-art methods in distillation performance.
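To illustrate why standardizing logits before distribution matching can help, the following is a minimal sketch and not the authors' implementation. It assumes a per-sample z-score as the standardization and a KL divergence between softmax outputs as the matching loss; the abstract specifies neither, and all function names here are hypothetical.

```python
import math

def standardize(logits, eps=1e-8):
    """Z-score one logit vector (assumed form of the standardization step)."""
    mean = sum(logits) / len(logits)
    var = sum((x - mean) ** 2 for x in logits) / len(logits)
    std = math.sqrt(var) + eps  # eps avoids division by zero for constant logits
    return [(x - mean) / std for x in logits]

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def matching_loss(real_logits, synthetic_logits, use_standardization=True):
    """Assumed matching loss: KL between (optionally standardized) predictions."""
    if use_standardization:
        real_logits = standardize(real_logits)
        synthetic_logits = standardize(synthetic_logits)
    return kl_divergence(softmax(real_logits), softmax(synthetic_logits))

# Two logit vectors with the same class ranking but very different scales.
real = [1.0, 2.0, 3.0]
synthetic = [10.0, 20.0, 30.0]

raw_loss = matching_loss(real, synthetic, use_standardization=False)
std_loss = matching_loss(real, synthetic, use_standardization=True)
print(f"without standardization: {raw_loss:.4f}")
print(f"with standardization:    {std_loss:.4f}")
```

In this toy case the two vectors differ only in scale, so the unstandardized loss is large even though the relative class structure is identical, while the standardized loss is close to zero. This is the sense in which standardization keeps the range of the logits consistent before they are compared.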
