Label-Augmented Dataset Distillation

Seoungyoon Kang, Youngsun Lim, Hyunjung Shim

TL;DR

This study introduces Label-Augmented Dataset Distillation (LADD), a dataset distillation framework that augments distilled datasets with dense labels and improves accuracy by an average of 14.9% across three baseline distillation algorithms.

Abstract

Traditional dataset distillation primarily focuses on image representation while often overlooking the important role of labels. In this study, we introduce Label-Augmented Dataset Distillation (LADD), a new framework that enhances dataset distillation with label augmentation. LADD sub-samples each synthetic image and generates additional dense labels to capture rich semantics. These dense labels require only a 2.5% increase in storage (on ImageNet subsets) while providing strong learning signals and significant performance benefits. Our label generation strategy complements existing dataset distillation methods, significantly improving their training efficiency and performance. Experimental results demonstrate that LADD outperforms existing methods in both computational overhead and accuracy. Applied to three high-performance dataset distillation algorithms, LADD improves accuracy by an average of 14.9%. Furthermore, the method is shown to be effective across various datasets, distillation hyperparameters, and algorithms. Finally, our method improves the cross-architecture robustness of the distilled dataset, which matters in real application scenarios.

Paper Structure (22 sections, 16 equations, 6 figures, 8 tables, 1 algorithm)

Figures (6)

  • Figure 1: Overview of LADD. Once the distilled dataset $D$ is synthesized by a baseline method, LADD initiates label augmentation. It divides each image in $D$ into $N \times N$ sub-images (illustrated with $N=3$). Then, $N^2$ soft labels are computed using the labeler $g$ to produce the dense label. The label-augmented distilled dataset $D_{LA}$ consists of images, labels, and dense labels; it is used in the deployment stage to train the evaluation model.
  • Figure 2: FLOPs-Accuracy Plot for Distillation. The $x$-axis indicates the total computational cost to obtain $D$ in FLOPs. For LADD, we compute FLOPs for both synthesizing $D$ and creating dense labels. Each result uses ImageNette.
  • Figure 3: FLOPs-Accuracy Plot at the Deployment Stage. The $x$-axis indicates the total computational cost at the deployment stage in FLOPs. Among the three algorithms, LADD shows the best performance. Each result uses ImageNette at 5 IPC.
  • Figure 4: Analysis on the Dataset Quality. The second and third columns depict GradCAM (Selvaraju et al., 2017) visualizations of each prediction from GLaD(MTT) (baseline) and LADD-GLaD(MTT) (LADD), respectively.
  • Figure 5: Analysis on the Labeler $g$. (a) The blue line indicates the labeler performance. The orange line depicts the accuracy of the test model in the deployment stage, where the dense labels in the distilled dataset are obtained from the labeler at each epoch. (b) Each bar graph depicts the prediction probability of the example image using the labeler at each epoch.
  • ...and 1 more figures
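
The label augmentation step described in the Figure 1 caption (split each distilled image into $N \times N$ sub-images, then compute one soft label per sub-image with the labeler $g$) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes non-overlapping sub-images, a single-channel image stored as nested lists, and a labeler that returns raw logits which are turned into soft labels with a softmax. The names `split_into_grid`, `dense_label`, and `toy_labeler` are hypothetical.

```python
import math
from typing import Callable, List, Sequence

Image = List[List[float]]  # single-channel H x W image, for brevity


def split_into_grid(image: Image, n: int) -> List[Image]:
    """Split an image into n*n sub-images in row-major order.

    Assumption: sub-images do not overlap and tile the image exactly.
    """
    h, w = len(image), len(image[0])
    if h % n or w % n:
        raise ValueError("image height and width must be divisible by n")
    ph, pw = h // n, w // n
    patches = []
    for i in range(n):
        for j in range(n):
            rows = image[i * ph:(i + 1) * ph]
            patches.append([row[j * pw:(j + 1) * pw] for row in rows])
    return patches


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense_label(image: Image, n: int,
                labeler: Callable[[Image], Sequence[float]]) -> List[List[float]]:
    """Return n*n soft labels, one per sub-image.

    Assumption: `labeler` stands in for g and returns class logits.
    """
    return [softmax(labeler(patch)) for patch in split_into_grid(image, n)]


def toy_labeler(patch: Image) -> List[float]:
    """Hypothetical two-class labeler that scores a patch by mean intensity."""
    mean = sum(map(sum, patch)) / (len(patch) * len(patch[0]))
    return [mean, -mean]


# A 6 x 6 toy image split with N = 3 yields 9 sub-images of size 2 x 2.
image = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
labels = dense_label(image, 3, toy_labeler)
print(len(labels))  # 9
```

In a real pipeline the labeler would be a trained classifier and the resulting $N^2$ soft labels would be stored alongside each distilled image and its original label.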