Post-Pruning Accuracy Recovery via Data-Free Knowledge Distillation

Chinmay Tripurwar, Utkarsh Maurya, Dishant

TL;DR

This work tackles privacy-preserving model compression by recovering accuracy after pruning without access to the original training data. It pairs DeepInversion-based synthetic data generation, which matches the teacher's BatchNorm statistics, with knowledge distillation from a pre-trained teacher to a pruned student, and reaches near-teacher accuracy on CIFAR-10 across ResNet variants at 75% pruning. Deeper networks are more robust to pruning, and the synthetic data is sufficient for distillation: recovered performance saturates within about 1% of the teacher. The approach offers a practical, privacy-friendly path to deploying compressed neural networks on edge devices, with GAN-based data diversification and extensions to structured pruning left as open directions.

Abstract

Model pruning is a widely adopted technique to reduce the computational complexity and memory footprint of Deep Neural Networks (DNNs). However, global unstructured pruning often leads to significant degradation in accuracy, typically necessitating fine-tuning on the original training dataset to recover performance. In privacy-sensitive domains such as healthcare or finance, access to the original training data is often restricted post-deployment due to regulations (e.g., GDPR, HIPAA). This paper proposes a Data-Free Knowledge Distillation framework to bridge the gap between model compression and data privacy. We utilize DeepInversion to synthesize privacy-preserving "dream" images from the pre-trained teacher model by inverting Batch Normalization (BN) statistics. These synthetic images serve as a transfer set to distill knowledge from the original teacher to the pruned student network. Experimental results on CIFAR-10 across various architectures (ResNet, MobileNet, VGG) demonstrate that our method significantly recovers accuracy lost during pruning without accessing a single real data point.
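The distillation step described above can be illustrated with a small sketch. The abstract does not state the exact loss the authors use, so this assumes the common temperature-softened formulation: the KL divergence from the teacher's softened distribution to the student's, scaled by T^2. The function names and the default temperature of 4.0 are illustrative assumptions, and plain Python lists stand in for logit tensors.

```python
import math


def log_softmax(logits, temperature):
    """Log of the temperature-scaled softmax, computed stably via log-sum-exp."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return [z - log_total for z in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: this is the standard soft-target loss, not necessarily
    the exact objective used in the paper.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )
    return temperature ** 2 * kl


# Hypothetical logits for one synthetic image over three classes.
teacher = [5.0, 1.0, -2.0]
student = [2.0, 1.5, 0.0]
print(distillation_loss(student, teacher))
```

The loss is zero when the student reproduces the teacher's logits up to an additive constant and positive otherwise. No ground-truth labels appear anywhere, which is why a synthetic transfer set is enough to drive it.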

Paper Structure

This paper contains 22 sections, 6 equations, 6 figures, 1 table, 1 algorithm.

Figures (6)

  • Figure 1: Visualization of Neural Network Pruning. The left diagram represents a dense network, while the right represents the sparse network after global unstructured pruning.
  • Figure 2: Overview of the Knowledge Distillation process where the Teacher guides the Student network.
  • Figure 3: DeepInversion Pipeline
  • Figure 4: The proposed pipeline: (A) Teacher Inversion to create synthetic data, followed by (B) Student Recovery via Distillation.
  • Figure 5: Visualization of synthetic images recovered from ResNet50 BN statistics. While not photorealistic, these images capture the dominant frequency and color statistics required for the student to learn decision boundaries.
  • ...and 1 more figure
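To make the global unstructured pruning pictured in Figure 1 concrete, here is a minimal sketch in plain Python. It assumes magnitude-based pruning, a common choice that this excerpt does not confirm as the authors' criterion. The function name, the dictionary-of-lists weight layout, and the toy layer names are hypothetical.

```python
def global_magnitude_prune(layers, sparsity=0.75):
    """Zero the smallest-magnitude weights, ranked jointly across all layers.

    layers: dict mapping a layer name to a flat list of float weights.
    Returns (pruned_layers, masks); a mask entry of 1 marks a kept weight.
    Assumption: weights tied with the threshold are pruned too, so the
    achieved sparsity can slightly exceed the requested one.
    """
    magnitudes = sorted(abs(w) for ws in layers.values() for w in ws)
    n_prune = int(sparsity * len(magnitudes))
    # With nothing to prune, use a threshold below every possible magnitude.
    threshold = magnitudes[n_prune - 1] if n_prune > 0 else -1.0

    pruned, masks = {}, {}
    for name, ws in layers.items():
        masks[name] = [1 if abs(w) > threshold else 0 for w in ws]
        pruned[name] = [w if keep else 0.0 for w, keep in zip(ws, masks[name])]
    return pruned, masks


# Toy two-layer "network" with eight weights in total.
toy_layers = {
    "conv1": [0.9, -0.05, 0.3, -0.7],
    "fc": [0.01, -0.02, 0.6, 0.1],
}
pruned_layers, keep_masks = global_magnitude_prune(toy_layers, sparsity=0.75)
print(pruned_layers)
print(keep_masks)
```

Because the threshold is shared across layers, per-layer sparsity is not uniform: in this toy case every weight in `fc` is removed while `conv1` keeps two. When the pruned student is later trained by distillation, the masks would typically be re-applied after each update so the pruned weights stay at zero.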