Practical Dataset Distillation Based on Deep Support Vectors

Hyunho Lee; Junhoo Lee; Nojun Kwak

TL;DR

This paper introduces a distillation method that augments the conventional process with a Deep KKT (DKKT) loss, which incorporates general model knowledge. The method improves on the baseline distribution matching distillation method on the CIFAR-10 dataset.

Abstract

Conventional dataset distillation requires significant computational resources and assumes access to the entire dataset, which is impractical because it presumes that all data resides on a central server. In this paper, we focus on dataset distillation in practical scenarios where only a fraction of the entire dataset is accessible. We introduce a novel distillation method that augments the conventional process by incorporating general model knowledge through an additional Deep KKT (DKKT) loss. In practical settings, our approach outperforms the baseline distribution matching distillation method on the CIFAR-10 dataset. Additionally, we present experimental evidence that Deep Support Vectors (DSVs) provide information that the original distillation does not, and that integrating them improves performance.
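As a rough illustration of how a data-knowledge term and a model-knowledge term can be combined, the sketch below computes a distribution matching (DM) style loss as the squared distance between mean feature embeddings and adds a weighted second term. It is not the authors' implementation: the function names, the `weight` parameter, and the use of a precomputed scalar `model_term` as a stand-in for the DKKT loss are all assumptions, since this page gives neither the exact form of the DKKT loss nor the weighting scheme.

```python
import numpy as np


def dm_loss(real_feats: np.ndarray, syn_feats: np.ndarray) -> float:
    """Squared distance between the mean embeddings of real and synthetic samples.

    Both inputs are (num_samples, feature_dim) arrays of features for one class.
    """
    diff = real_feats.mean(axis=0) - syn_feats.mean(axis=0)
    return float(np.dot(diff, diff))


def combined_loss(real_feats: np.ndarray,
                  syn_feats: np.ndarray,
                  model_term: float,
                  weight: float = 1.0) -> float:
    """DM term plus a weighted model-knowledge term.

    `model_term` is a hypothetical placeholder for a DKKT-style loss value
    computed elsewhere; `weight` is an assumed balancing coefficient.
    """
    return dm_loss(real_feats, syn_feats) + weight * model_term


# Toy usage: two real feature vectors and one synthetic feature vector.
real = np.array([[0.0, 0.0], [2.0, 2.0]])
syn = np.array([[0.0, 0.0]])
total = combined_loss(real, syn, model_term=0.5, weight=2.0)
```

In the practical setting described above, `real_feats` would come only from the accessible fraction of the data, while the second term would carry information from the model trained on the full dataset.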

Paper Structure (21 sections, 5 equations, 3 figures, 1 table)

Introduction
Related Work
Dataset Distillation.
Model Inversion.
Method
Notation.
Preliminaries
Deep Support Vectors.
Distribution Matching.
Combining knowledge to mitigate low diversity
Problem.
DM loss (Data knowledge loss).
DKKT loss (Model knowledge loss).
Practical Dataset Distillation.
Experiments
...and 6 more sections

Figures (3)

Figure 1: Dataset distillation in practical scenarios: data is gathered on edge devices, most of which is private (red), while safe data (green) that can be transferred to the central server is scarce. However, a lightweight application model, continually trained on the entire dataset on the edge device, can be transferred to the server without any privacy concerns.
Figure 2: Qualitative results of our method, with 1 image per class (ipc) and 50 practically accessible images per class (pipc).
Figure 3: Results for the average of DSVs (top) and DM (bottom). The Fourier-transformed (FFT) images indicate that the two lie in different frequency domains, and the averaged sample resulted in better accuracy (25.34%).
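The kind of frequency comparison described for Figure 3 can be sketched with NumPy's FFT routines. The example below is only an illustration under stated assumptions, not the authors' analysis: `log_spectrum`, `high_freq_ratio`, and the `radius` cutoff are hypothetical names and choices, and the two toy images merely stand in for a low-frequency and a high-frequency sample.

```python
import numpy as np


def log_spectrum(img: np.ndarray) -> np.ndarray:
    """Log-magnitude 2D spectrum with the zero frequency shifted to the center."""
    spec = np.fft.fftshift(np.fft.fft2(img))
    return np.log1p(np.abs(spec))


def high_freq_ratio(img: np.ndarray, radius: float) -> float:
    """Fraction of spectral energy farther than `radius` from the zero frequency."""
    h, w = img.shape
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)
    total = power.sum()
    return float(power[dist > radius].sum() / total) if total > 0 else 0.0


# Toy stand-ins: a constant image (all energy at the zero frequency) and a
# checkerboard (all energy at the highest representable frequency).
flat = np.ones((8, 8))
rows, cols = np.indices((8, 8))
checker = np.where((rows + cols) % 2 == 0, 1.0, -1.0)

# Averaging the two images mixes their frequency content.
averaged = 0.5 * (flat + checker)
ratio_flat = high_freq_ratio(flat, radius=1.0)
ratio_checker = high_freq_ratio(checker, radius=1.0)
ratio_avg = high_freq_ratio(averaged, radius=1.0)
```

Here `ratio_flat` is close to 0, `ratio_checker` is close to 1, and the averaged image falls in between, which is the sense in which averaging samples from different frequency ranges combines their content.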