Parameter and Data-Efficient Spectral StyleDCGAN

Aryan Garg

TL;DR

This work tackles the high data and parameter demands of state-of-the-art GANs by introducing Spectral Style-DCGAN (SSD), a parameter- and data-efficient unconditional GAN trained on a small AFHQ Dog subset to generate 64x64 dog faces. The generator uses a lightweight 100-dim style mapping (a 4-layer MLP) to produce a disentangled W-space, which is injected through AdaIN into a five-layer upsampling network; the discriminator is spectrally normalized, which acts as a key regularizer. SSD achieves competitive fidelity with only 6.57M parameters, substantially fewer than StyleGAN, and demonstrates data efficiency through ablations and clean-FID metrics. The ablations highlight the role of spectral normalization in enabling meaningful generator learning and latent disentanglement. The work has practical implications for deploying GANs in data-scarce domains and provides open-source code for reproducibility and further research.
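The two building blocks named above, AdaIN and spectral normalization, can be illustrated with a small framework-free sketch. This is not the author's implementation (the linked repository is the reference for that). The function names `adain` and `spectral_normalize`, the list-based tensors, and the iteration count are assumptions made for illustration only. In a style-based generator the `scale` and `shift` values would be produced from a W-space vector by a learned affine map; here they are passed in directly.

```python
import math
import random


def adain(channel, scale, shift, eps=1e-5):
    """Adaptive instance normalization for ONE feature-map channel.

    `channel` is the channel's activations flattened to a list. The channel is
    normalized to zero mean / unit variance, then re-scaled and shifted by
    style-derived values (hypothetical inputs in this sketch).
    """
    mean = sum(channel) / len(channel)
    var = sum((x - mean) ** 2 for x in channel) / len(channel)
    std = math.sqrt(var + eps)
    return [scale * (x - mean) / std + shift for x in channel]


def _unit(vec):
    """Return vec scaled to unit Euclidean length (unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def spectral_normalize(weight, n_iters=50, seed=0):
    """Divide a weight matrix by an estimate of its largest singular value.

    `weight` is a list of rows. The singular value is estimated by power
    iteration; assumes `weight` is not the zero matrix.
    Returns (normalized_weight, sigma_estimate).
    """
    rng = random.Random(seed)
    rows, cols = len(weight), len(weight[0])
    u = _unit([rng.gauss(0.0, 1.0) for _ in range(rows)])
    v = [0.0] * cols
    for _ in range(n_iters):
        # v <- normalize(W^T u), then u <- normalize(W v)
        v = _unit([sum(weight[i][j] * u[i] for i in range(rows)) for j in range(cols)])
        u = _unit([sum(weight[i][j] * v[j] for j in range(cols)) for i in range(rows)])
    # sigma ~= u^T W v
    sigma = sum(u[i] * weight[i][j] * v[j] for i in range(rows) for j in range(cols))
    return [[w / sigma for w in row] for row in weight], sigma


if __name__ == "__main__":
    styled = adain([1.0, 2.0, 3.0, 4.0], scale=2.0, shift=5.0)
    w_sn, sigma = spectral_normalize([[3.0, 0.0], [0.0, 1.0]])
    print(styled)
    print(w_sn, sigma)
```

Dividing each discriminator weight matrix by its largest singular value bounds how much that layer can stretch its input, which is the regularizing effect the summary attributes to the spectrally normalized discriminator. Deep learning frameworks typically run only one power-iteration step per training update and reuse the vector `u` across updates; the 50 iterations here are simply a convenient choice for a standalone example.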

Abstract

We present a simple, highly parameter- and data-efficient adversarial network for unconditional face generation. Our method, Spectral Style-DCGAN (SSD), uses only 6.574 million parameters and 4739 dog faces from the Animal Faces HQ (AFHQ) dataset as training samples, while preserving fidelity at low resolutions up to 64x64. Code is available at https://github.com/Aryan-Garg/StyleDCGAN.

Paper Structure (20 sections, 7 figures, 3 tables)

This paper contains 20 sections, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Spectral normalized discriminator improvement synthesis
  • Figure 2: Ablation: No spectral normalization. Style Mapping and AdaIN synthesis
  • Figure 3: Training dataset samples at original 512x512 resolution.
  • Figure 4: Ablation study: Effect of spectral normalization layers in the discriminator. The left column shows training loss plots of the discriminator (top) and generator (bottom) with spectral normalization layers in the discriminator, while the right column shows the training loss plots with the ablated discriminator.
  • Figure 5: Ablation: Disentangling the latent vectors. Style-Mapping without AdaIN Synthesis
  • ...and 2 more figures