Condensed Data Expansion Using Model Inversion for Knowledge Distillation

Kuluhan Binici, Shivam Aggarwal, Cihan Acar, Nam Trung Pham, Karianto Leman, Gim Hee Lee, Tulika Mitra

TL;DR

The paper addresses the limited information content of condensed datasets in knowledge distillation (KD) by expanding them with synthetic data generated through model inversion (MI). It introduces a feature-alignment discriminator that conditions synthetic data on condensed prototypes, so that synthetic samples more closely reflect the underlying data distribution and the domain gap shrinks. Results on CIFAR-10/100 and ImageNet-200 show consistent KD improvements, with gains of up to 11.4 percentage points over standard MI-based KD. The method remains effective with as few as one condensed sample per class, or with limited real data in few-shot settings. It is compatible with existing MI techniques and strengthens KD for heterogeneous model pairs, privacy-preserving contexts, and data-scarce regimes, which makes it practical for KD pipelines built on condensed data.

Abstract

Condensed datasets offer a compact representation of larger datasets, but training models directly on them or using them to enhance model performance through knowledge distillation (KD) can result in suboptimal outcomes due to limited information. To address this, we propose a method that expands condensed datasets using model inversion, a technique for generating synthetic data based on the impressions of a pre-trained model on its training data. This approach is particularly well-suited for KD scenarios, as the teacher model is already pre-trained and retains knowledge of the original training data. By creating synthetic data that complements the condensed samples, we enrich the training set and better approximate the underlying data distribution, leading to improvements in student model accuracy during knowledge distillation. Our method demonstrates significant gains in KD accuracy compared to using condensed datasets alone and outperforms standard model inversion-based KD methods by up to 11.4% across various datasets and model architectures. Importantly, it remains effective even when using as few as one condensed sample per class, and can also enhance performance in few-shot scenarios where only limited real data samples are available.
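
To make the distillation step concrete, the following is a minimal sketch of a KD objective evaluated over an expanded training set, that is, condensed samples plus synthetic samples from model inversion. It assumes the common temperature-softened KL-divergence loss; the abstract does not state the exact loss, and the names `kd_loss`, `expanded_batch_loss`, and the temperature value are illustrative assumptions rather than the authors' implementation.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: this is the widely used KD loss, not necessarily the
    one used in the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

def expanded_batch_loss(condensed, synthetic, temperature=4.0):
    """Mean KD loss over condensed and synthetic samples together.

    Each element is a (teacher_logits, student_logits) pair; in practice
    the logits would come from forward passes of the two networks.
    """
    batch = list(condensed) + list(synthetic)
    return sum(kd_loss(t, s, temperature) for t, s in batch) / len(batch)
```

The only point illustrated here is that the synthetic samples enter the same distillation objective as the condensed ones, so expanding the set gives the student more teacher outputs to match.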

Paper Structure

This paper contains 20 sections, 5 equations, 5 figures, 5 tables, 1 algorithm.

Figures (5)

  • Figure 1: Illustration of our motivation for using condensed samples as prototypes for synthetic data.
  • Figure 2: (a) Overview of our condensed-samples guided GMI (generative model inversion) framework. The discriminator is optimized to distinguish real and fake features, while the generator tries to prevent it from doing so by aligning them. Generator, discriminator, and student models are trained in alternate steps. (b) Illustration of our motivation for using condensed samples as templates for synthetic data.
  • Figure 3: 2D visualisation of feature vectors. Faded and bold markers denote feature space projections of real samples from the CIFAR-10 dataset and synthetic samples, respectively. Condensed data-guided synthetic samples exhibit better alignment with the real data distribution.
  • Figure 4: Students distilled using only samples from model inversion (left) and using expanded data generated by our method (right), for different condensed dataset sizes. Condensed datasets with 1, 10, and 50 samples per class (spc) from CIFAR-10, and 1 and 10 spc from CIFAR-100, were used.
  • Figure 5: The first two rows contain synthetic CIFAR-100 samples obtained with and without condensed data-guided model inversion. The last two rows show condensed and real samples.
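
The adversarial feature alignment described in the Figure 2 caption can be sketched as a pair of losses. This is a minimal illustration that assumes a standard binary cross-entropy GAN objective and a hypothetical logistic discriminator over feature vectors; the caption does not give the exact loss or discriminator architecture, so every name and parameter below is an assumption.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def discriminator_score(features, weights, bias):
    """Probability that a feature vector comes from a condensed ("real") sample.

    Hypothetical linear discriminator, used only for illustration.
    """
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)

def discriminator_loss(real_feats, fake_feats, weights, bias, eps=1e-12):
    """Discriminator step: score real features high and synthetic features low."""
    real = [-math.log(discriminator_score(f, weights, bias) + eps)
            for f in real_feats]
    fake = [-math.log(1.0 - discriminator_score(f, weights, bias) + eps)
            for f in fake_feats]
    return sum(real) / len(real) + sum(fake) / len(fake)

def generator_alignment_loss(fake_feats, weights, bias, eps=1e-12):
    """Generator step: make synthetic features look real to the discriminator."""
    losses = [-math.log(discriminator_score(f, weights, bias) + eps)
              for f in fake_feats]
    return sum(losses) / len(losses)
```

In an alternating schedule like the one the caption describes, the discriminator would be updated to lower `discriminator_loss`, the generator to lower `generator_alignment_loss`, and the student would then be distilled on the resulting samples.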