IMS3: Breaking Distributional Aggregation in Diffusion-Based Dataset Distillation

Chenru Wang, Yunyi Chen, Zijun Yang, Joey Tianyi Zhou, Chi Zhang

Abstract

Dataset distillation aims to synthesize compact datasets that approximate the training efficacy of large-scale real datasets, offering an efficient answer to the growing computational demands of modern deep learning. Recently, diffusion-based dataset distillation methods have shown great promise by leveraging the strong generative capacity of diffusion models to produce diverse and structurally consistent samples. However, a fundamental goal misalignment persists: diffusion models are optimized for generative likelihood rather than discriminative utility, which leads to over-concentration in high-density regions and inadequate coverage of the boundary samples that are crucial for classification. To address this issue, we propose two complementary strategies. Inversion-Matching (IM) introduces an inversion-guided fine-tuning process that aligns denoising trajectories with their inversion counterparts, broadening distributional coverage and enhancing diversity. Selective Subgroup Sampling (S$^3$) is a training-free sampling mechanism that improves inter-class separability by selecting synthetic subsets that are both representative and distinctive. Extensive experiments demonstrate that our approach significantly enhances the discriminative quality and generalization of distilled datasets, achieving state-of-the-art performance among diffusion-based methods.

Paper Structure (16 sections, 14 equations, 20 figures, 13 tables, 2 algorithms)

Figures (20)

  • Figure 1: t-SNE visualization of feature distributions for different distilled datasets. The blue density map represents the feature distribution of the original dataset. The purple squares correspond to samples generated by DiT, the orange dots correspond to Minimax, and the red stars indicate samples produced by our ImS3. Our method covers a broader area of the feature space, indicating improved distributional coverage.
  • Figure 2: Overview of the proposed ImS3 framework. (a) IM fine-tuning expands distribution coverage by aligning denoised and inverted latents through the IM loss $\mathcal{L}_{\mathrm{IM}}$. (b) S3 enhances discriminative diversity during sampling by selecting representative subgroups whose centroids are close to real data centroids while maintaining inter-class separation, optimized via $\mathcal{L}_{\mathrm{S}^3}$. $r_i$ denotes the centroid of real samples from the $i$-th class, while $c_{i,g_i}$ represents the centroid of the $g_i$-th candidate subgroup synthesized by the diffusion model for the same class. The selected optimal subgroup centroid $c_{i,g_i^*}$ strikes the best balance between being representative of the real class distribution and being distinct from other classes.
  • Figure 3: Heatmap of classification accuracy under different combinations of $\alpha$ and $\beta$. The heatmap shows that balanced values of $\alpha$ and $\beta$ yield the best performance.
  • Figure 4: Analysis of the number of real images from the original dataset used to compute the reference centroids in Eq. \ref{eq:real-centroid}. Moderate sample counts provide stable and reliable centroids for subgroup selection, while too few samples introduce noise and degrade selection quality.
  • Figure 5: Validation accuracy under different matching strengths. Performance of our method across a range of matching coefficients $\lambda_{\mathrm{IM}}$ for IPC = 10, 20, and 50. Each curve corresponds to a fixed IPC, and shaded regions denote performance variability across runs.
  • ...and 15 more figures
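
The subgroup selection described in the Figure 2(b) caption can be illustrated with a minimal sketch. The exact form of $\mathcal{L}_{\mathrm{S}^3}$ is not given in this excerpt, so the score below is an assumption: each candidate centroid $c_{i,g}$ is scored by $\alpha$ times its distance to the real centroid $r_i$ minus $\beta$ times its distance to the nearest real centroid of another class, and the lowest score wins. The function name `select_subgroups`, the Euclidean distance, and the independent per-class selection are all hypothetical simplifications, not the authors' implementation.

```python
import numpy as np


def select_subgroups(real_centroids, cand_centroids, alpha=1.0, beta=1.0):
    """Pick one candidate subgroup per class (illustrative scoring only).

    real_centroids: (C, D) array holding r_i for each of C classes.
    cand_centroids: (C, G, D) array holding c_{i,g} for G candidates per class.
    Returns an integer array of shape (C,) with the chosen candidate index.
    """
    real = np.asarray(real_centroids, dtype=float)
    cand = np.asarray(cand_centroids, dtype=float)
    num_classes = real.shape[0]
    if num_classes < 2:
        raise ValueError("inter-class separation needs at least two classes")

    # dists[i, g, j] = Euclidean distance between c_{i,g} and r_j.
    dists = np.linalg.norm(cand[:, :, None, :] - real[None, None, :, :], axis=-1)

    idx = np.arange(num_classes)
    # Representativeness term: distance to the centroid of the same class.
    rep = dists[idx, :, idx]  # shape (C, G)

    # Distinctiveness term: distance to the nearest centroid of another class.
    masked = dists.copy()
    masked[idx, :, idx] = np.inf  # ignore the candidate's own class
    sep = masked.min(axis=-1)  # shape (C, G)

    # Assumed score: small when close to r_i and far from every other r_j.
    score = alpha * rep - beta * sep
    return score.argmin(axis=1)


if __name__ == "__main__":
    real = np.array([[0.0, 0.0], [10.0, 0.0]])
    cand = np.array([
        [[0.5, 0.0], [5.0, 0.0]],    # candidates for class 0
        [[6.0, 0.0], [10.5, 0.0]],   # candidates for class 1
    ])
    print(select_subgroups(real, cand))
```

In this toy setup each class keeps the candidate that lies near its own real centroid and away from the other class. A method that scores separation between the selected synthetic centroids themselves, rather than against real centroids, would need a joint search over classes instead of this per-class argmin.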