Nickel and Diming Your GAN: A Dual-Method Approach to Enhancing GAN Efficiency via Knowledge Distillation

Sangyeop Yeo, Yoojin Jang, Jaejun Yoo

TL;DR

This paper tackles the deployment of GANs in resource-constrained settings by introducing two complementary methods, DiME and NICKEL. DiME matches the output distributions of the teacher generator $G^T$ and the student generator $G^S$ in an embedding space, using Maximum Mean Discrepancy with foundation models as embedding kernels. NICKEL stabilizes compression by transferring knowledge through the student discriminator ($D^S$), whose feedback guides the student generator ($G^S$). On StyleGAN2 with FFHQ, the approach yields state-of-the-art compression results (e.g., an FID of 15.93 at 98.92% compression and 29.38 at 99.69%), retaining generative quality even at extreme compression rates. Together, these methods substantially reduce GAN compute while preserving image fidelity, enabling practical deployment in limited-resource environments.

Abstract

In this paper, we address the challenge of compressing generative adversarial networks (GANs) for deployment in resource-constrained environments by proposing two novel methodologies: Distribution Matching for Efficient compression (DiME) and Network Interactive Compression via Knowledge Exchange and Learning (NICKEL). DiME employs foundation models as embedding kernels for efficient distribution matching, leveraging maximum mean discrepancy to facilitate effective knowledge distillation. Simultaneously, NICKEL employs an interactive compression method that enhances the communication between the student generator and discriminator, achieving a balanced and stable compression process. Our comprehensive evaluation on the StyleGAN2 architecture with the FFHQ dataset shows the effectiveness of our approach, with NICKEL & DiME achieving FID scores of 10.45 and 15.93 at compression rates of 95.73% and 98.92%, respectively. Remarkably, our methods sustain generative quality even at an extreme compression rate of 99.69%, surpassing the previous state-of-the-art performance by a large margin. These findings not only demonstrate our methodologies' capacity to significantly lower GANs' computational demands but also pave the way for deploying high-quality GAN models in settings with limited resources. Our code will be released soon.
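
To make the distribution-matching idea concrete, the following is a minimal sketch of a squared maximum mean discrepancy (MMD) estimate between two sets of embeddings. It is not the authors' implementation: the random Gaussian vectors stand in for foundation-model embeddings of teacher and student images, and the Gaussian (RBF) kernel, the `bandwidth` value, and the function names `rbf_kernel` and `mmd2` are assumptions chosen only for illustration.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 * bandwidth^2))."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def mmd2(teacher_emb, student_emb, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two embedding sets."""
    k_tt = rbf_kernel(teacher_emb, teacher_emb, bandwidth)
    k_ss = rbf_kernel(student_emb, student_emb, bandwidth)
    k_ts = rbf_kernel(teacher_emb, student_emb, bandwidth)
    return k_tt.mean() + k_ss.mean() - 2.0 * k_ts.mean()


# Hypothetical stand-ins for embeddings of generated images (assumption:
# in practice these would come from a pretrained foundation model).
rng = np.random.default_rng(0)
teacher = rng.normal(0.0, 1.0, size=(256, 16))
student_close = rng.normal(0.0, 1.0, size=(256, 16))  # same distribution
student_far = rng.normal(1.0, 1.0, size=(256, 16))    # shifted distribution

bw = 4.0  # illustrative bandwidth, not taken from the paper
mmd_close = mmd2(teacher, student_close, bw)
mmd_far = mmd2(teacher, student_far, bw)
print(f"MMD^2 (matched): {mmd_close:.4f}  MMD^2 (shifted): {mmd_far:.4f}")
```

The estimate depends only on the two sample sets as a whole, not on pairing a particular teacher output with a particular student output, which is why a measure of this kind suits unconditional generation, where individual outputs need not match.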

Paper Structure (19 sections, 4 equations, 6 figures, 4 tables)


Figures (6)

  • Figure 1: Comparison of knowledge distillation methods. (a) In classification tasks, output labels are matched instance by instance between the teacher and the student. Output labels lie in a low-dimensional space, and ideally the student and teacher outputs are identical. (b) In conditional generative tasks, output images are matched instance by instance between the teacher and the student. Output images lie in a high-dimensional space, and the student and teacher outputs are similar (in terms of structure or background). (c) In unconditional generative tasks, the distributions of output images are matched between the teacher and the student. Each input need not produce the same output.
  • Figure 2: A schematic overview of our method. Our method consists of (a) Distribution Matching for Efficient compression (DiME), (b) Network Interactive Compression via Knowledge Exchange and Learning (NICKEL), and (c) adversarial loss. (a) matches the outputs between the teacher generator ($G^T$) and the student generator ($G^S$) via the foundation model $\phi$ in the embedding space. (b) matches the intermediate features between the teacher generator and the student discriminator ($D^S$). (c) represents the adversarial loss between the student generator and both the teacher discriminator ($D^T$) and the student discriminator.
  • Figure 3: Stability comparison between our method and state-of-the-art compression methods. (a) shows the discriminator logits for the pruned generator under ITGC (Kang et al., 2022). The green solid line represents the ideal equilibrium state. At a compression rate of 98.92% (blue dashed line), the imbalance is more severe than at 90.73% (red dashed line). (b) shows the discriminator logits for the pruned generator under NICKEL & DiME. Our method mitigates the imbalance between the discriminator and the pruned generator. (c) shows the FID convergence curves at a compression rate of 98.92%. NICKEL & DiME converges most stably.
  • Figure 4: Performance as a function of compression rate on StyleGAN2 for FFHQ. (a) shows how FID varies with compression rate. NICKEL & DiME consistently outperforms other state-of-the-art compression methods across compression rates. At a compression rate of 74.96%, NICKEL & DiME shows only 9.68% performance degradation relative to the full model, and its performance degrades less with increasing compression rate than that of other state-of-the-art compression methods. (b) shows how Precision varies with compression rate. NICKEL & DiME achieves fidelity comparable to other methods. (c) shows how Recall varies with compression rate. NICKEL & DiME preserves diversity better than other methods, even at higher compression rates.
  • Figure 5: Visualization of images generated by compressed StyleGAN2 on FFHQ and LSUN-CAT. (a) shows the visual quality of StyleGAN2 compressed by NICKEL & DiME on FFHQ at compression rate = 90.73%. (b) shows the visual quality of StyleGAN2 compressed by NICKEL & DiME on LSUN-CAT at compression rate = 90.73%.
  • ...and 1 more figure