D$^4$M: Dataset Distillation via Disentangled Diffusion Model
Duo Su, Junjie Hou, Weizhi Gao, Yingjie Tian, Bowen Tang
TL;DR
D$^4$M introduces a cross-architecture, diffusion-model–driven dataset distillation framework that replaces synthesis-time matching with Training-Time Matching and prototype-guided latent diffusion. By extracting category prototypes via clustering in a latent space and conditioning a Latent Diffusion Model on these prototypes and text prompts, D$^4$M generates high-resolution, realistic synthetic data without architecture-specific matching. Soft-label Training-Time Matching further enhances generalization across architectures, enabling scalable distillation on ImageNet-1K and other large-scale datasets with improved efficiency. The approach delivers state-of-the-art or competitive results across benchmarks, while reducing computational costs and enabling stable cross-architecture transfer for distilled datasets.
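The soft-label supervision mentioned above can be illustrated with a minimal sketch. It assumes the soft labels are logits from a pretrained teacher network and that the student is trained to match them with a temperature-scaled KL divergence. This loss is a common choice for soft-label training and is an assumption here; the summary does not specify the exact objective. The names `softmax`, `soft_label_loss`, and `temperature` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) between temperature-softened distributions.

    Assumption: the teacher logits act as the soft label for one
    synthetic image; the loss used in the paper may differ.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(
        pi * (math.log(pi) - math.log(qi))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge. Because the soft label comes from the teacher rather than from the network used at synthesis time, the same synthetic images can be reused to train students of different architectures.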
Abstract
Dataset distillation offers a lightweight synthetic dataset for fast network training with promising test accuracy. To imitate the performance of the original dataset, most approaches employ bi-level optimization, and their distillation space depends on the matching architecture. As a result, these approaches either incur significant computational costs on large-scale datasets or suffer a performance decline when the distilled data is used across architectures. We advocate designing an economical dataset distillation framework that is independent of the matching architecture. Based on empirical observations, we argue that constraining the real and synthetic image spaces to be consistent enhances cross-architecture generalization. Motivated by this, we introduce Dataset Distillation via Disentangled Diffusion Model (D$^4$M), an efficient framework for dataset distillation. In contrast to architecture-dependent methods, D$^4$M employs a latent diffusion model to guarantee this consistency and incorporates label information into category prototypes. The distilled datasets are versatile, eliminating the need to regenerate a distinct dataset for each architecture. In comprehensive experiments, D$^4$M demonstrates superior performance and robust generalization, surpassing SOTA methods in most aspects.
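As a rough illustration of how label information can be folded into category prototypes, the sketch below clusters latent vectors separately for each class and keeps the cluster centers as that class's prototypes. It assumes the latents were already produced by an encoder (not shown) and uses plain k-means; the specific clustering algorithm is an assumption, and `kmeans`, `class_prototypes`, and `per_class` are hypothetical names.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on a list of equal-length float vectors."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign the point to its nearest center (squared distance).
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def class_prototypes(latents, labels, per_class):
    """Group latents by label, then cluster each group independently."""
    by_class = {}
    for z, y in zip(latents, labels):
        by_class.setdefault(y, []).append(z)
    return {y: kmeans(zs, per_class) for y, zs in by_class.items()}
```

With `per_class` playing the role of the images-per-class budget, each class contributes exactly that many prototypes. Because clustering runs within a class, every prototype carries its label by construction and could then serve as the conditioning input for a generative decoder.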
