Diversity-Driven Synthesis: Enhancing Dataset Distillation through Directed Weight Adjustment

Jiawei Du; Xin Zhang; Juncheng Hu; Wenxin Huang; Joey Tianyi Zhou

TL;DR

This paper introduces a novel method that employs dynamic and directed weight adjustment techniques to modulate the synthesis process, thereby maximizing the representativeness and diversity of each synthetic instance.

Abstract

The sharp increase in data-related expenses has motivated research into condensing datasets while retaining the most informative features. Dataset distillation has thus recently come to the fore. This paradigm generates synthetic datasets that are representative enough to replace the original dataset in training a neural network. To avoid redundancy in these synthetic datasets, it is crucial that each element contains unique features and remains diverse from others during the synthesis stage. In this paper, we provide a thorough theoretical and empirical analysis of diversity within synthesized datasets. We argue that enhancing diversity can improve the parallelizable yet isolated synthesizing approach. Specifically, we introduce a novel method that employs dynamic and directed weight adjustment techniques to modulate the synthesis process, thereby maximizing the representativeness and diversity of each synthetic instance. Our method ensures that each batch of synthetic data mirrors the characteristics of a large, varying subset of the original dataset. Extensive experiments across multiple datasets, including CIFAR, Tiny-ImageNet, and ImageNet-1K, demonstrate the superior performance of our method, highlighting its effectiveness in producing diverse and representative synthetic datasets with minimal computational expense. Our code is available at https://github.com/AngusDujw/Diversity-Driven-Synthesis.

Paper Structure (19 sections, 22 equations, 5 figures, 11 tables, 1 algorithm)

Introduction
Preliminaries
Methodology
Batch Normalization Loss Enhances Diversity of ${\mathcal{S}}$
Random Perturbation on $\theta_{\mathcal{T}}$ Helps Improve Diversity
Directed Weight Adjustment on $\theta_{\mathcal{T}}$
Experiments
Results & Discussions
Ablation Study
Related Works
Conclusion
Appendix
Minimizing ${\mathcal{L}}_\mathrm{mean}$ and ${\mathcal{L}}_\mathrm{var}$ can be contradictory
Experiments
Hyper-parameter Settings
...and 4 more sections

Figures (5)

Figure 1: Left: t-SNE visualization of logit embeddings on the CIFAR-100 dataset. The scatter plot illustrates the distribution of synthetic data instances distilled by SRe2L (blue dots) and our DWA method (red stars). The blue density contours represent the distribution of natural data instances. Our DWA method demonstrates a more diverse and widespread distribution compared to SRe2L, indicating better generalization and coverage of the feature space. Right: The consequent performance improvement of DWA on various datasets. Experiments are conducted with 50 images per class.
Figure 2: Visualization of distilled images for the goldfish class. Panels (a) and (b) show the results synthesized by SRe2L and our DWA, respectively. The synthetic data instances generated by our DWA method exhibit significantly greater diversity compared to those produced by SRe2L, highlighting the effectiveness of our approach in capturing a broader range of features.
Figure 3: Analysis of the decoupled ${\mathcal{L}}_\mathrm{var}$ coefficient. We vary $\lambda_\mathrm{var}$ across a wide range, from $0.01$ to $0.23$. 'decoupled var' indicates that $\lambda_\mathrm{var}$ is varied on its own while the weight of the mean component stays fixed at its default of $0.01$. 'coupled var' indicates that the weight of the mean component and $\lambda_\mathrm{var}$ change in tandem. (a) and (b) illustrate the performance of the original SRe2L and our DWA in these two scenarios, respectively. This analysis is conducted on CIFAR-100 using ResNet-18. Each value of $\lambda_\mathrm{var}$ is evaluated in five independent experiments, with variance indicated by lighter color shades.
Figure 4: Normalized feature distance of the decoupled variance component with $\lambda_\mathrm{var} = 0.11$ (the weight of the mean component defaults to $0.01$) and of the coupled variance component with $\lambda_{\mathrm{BN}} = 0.11$. The outputs of ResNet-18's last convolutional layer are used for the feature distance calculation (see the appendix on feature distance). Ten classes are randomly chosen from the distilled CIFAR-100 dataset.
Figure 5: Performance grid of ResNet-18 with changes in perturbation steps $K$ and magnitude $\rho$.
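
Figure 5 varies two quantities, the number of perturbation steps $K$ and the magnitude $\rho$. As a rough illustration of how a multi-step, norm-bounded, directed weight perturbation can be organized, the sketch below applies $K$ normalized gradient steps of size $\rho / K$ to a toy linear model. Everything in it is an assumption made for illustration: the linear model, the mean-squared-error loss, the choice to move in the loss-increasing direction, and the function names `mse_loss`, `mse_grad`, and `directed_adjust`. It is not the update rule from the paper, which this page does not state.

```python
import numpy as np


def mse_loss(theta, x, y):
    """Mean squared error of a linear model x @ theta (toy stand-in for a network loss)."""
    residual = x @ theta - y
    return float(np.mean(residual ** 2))


def mse_grad(theta, x, y):
    """Gradient of mse_loss with respect to theta."""
    residual = x @ theta - y
    return 2.0 * x.T @ residual / len(y)


def directed_adjust(theta, x, y, rho, K):
    """Hypothetical directed perturbation: K normalized gradient steps of size rho / K.

    Assumption: the direction is the loss-increasing gradient on a seed batch (x, y).
    The total displacement is bounded by rho by the triangle inequality.
    """
    adjusted = theta.copy()
    for _ in range(K):
        g = mse_grad(adjusted, x, y)
        norm = np.linalg.norm(g)
        if norm == 0.0:
            break  # no direction to follow at a stationary point
        adjusted = adjusted + (rho / K) * g / norm
    return adjusted


# Toy seed batch and weights (random data, for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 5))
y = rng.normal(size=32)
theta = np.zeros(5)

rho, K = 0.1, 4
adjusted = directed_adjust(theta, x, y, rho=rho, K=K)
displacement = np.linalg.norm(adjusted - theta)
```

Splitting the budget $\rho$ across $K$ steps lets the direction be re-estimated at each intermediate point, while the overall displacement stays within $\rho$. With $K = 1$ the procedure reduces to a single normalized gradient step of length $\rho$.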
