Few-Shot Adaptation of Generative Adversarial Networks

Esther Robb, Wen-Sheng Chu, Abhishek Kumar, Jia-Bin Huang

TL;DR

This paper addresses adapting pretrained GANs to new domains when target data are scarce (fewer than 100 images). It introduces Few-Shot GAN (FSGAN), which reformulates adaptation in weight space: singular value decomposition is applied to each pretrained weight matrix, and only the singular values are learned while the left and right singular vectors stay fixed, i.e., $W^{(\ell)}_{\Sigma} = U_0^{(\ell)} \Sigma^{(\ell)} V_0^{(\ell)\top}$. The method is evaluated on near- and far-domain transfers with 5–100 shots and shows improved visual quality over the baselines TransferGAN, FreezeD, and SSGAN. The paper also highlights the limitations of the Fréchet Inception Distance (FID) as a sole metric in few-shot settings and supplements it with sharpness and the Face Quality Index. Key contributions include a simple yet expressive adaptation space, per-layer SVD applied to both generator and discriminator, and a demonstration that restricting parameter updates to singular values yields more stable and diverse outputs under severe data constraints. The work is relevant to deploying high-fidelity GANs efficiently in data-scarce scenarios and offers a framework for more robust evaluation in low-data regimes.

Abstract

Generative Adversarial Networks (GANs) have shown remarkable performance in image synthesis tasks, but typically require a large number of training samples to achieve high-quality synthesis. This paper proposes a simple and effective method, Few-Shot GAN (FSGAN), for adapting GANs in few-shot settings (fewer than 100 images). FSGAN repurposes component analysis techniques and learns to adapt the singular values of the pretrained weights while freezing the corresponding singular vectors. This provides a highly expressive parameter space for adaptation while constraining changes to the pretrained weights. We validate our method in a challenging few-shot setting of 5–100 images in the target domain. We show that our method has significant visual quality gains compared with existing GAN adaptation methods. We report qualitative and quantitative results showing the effectiveness of our method. We additionally highlight a problem for few-shot synthesis in the standard quantitative metric used by data-efficient image synthesis works. Code and additional results are available at http://e-271.github.io/few-shot-gan.
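
The reparameterization described in the abstract can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `decompose` and `adapted_weight` are hypothetical, a random matrix stands in for a pretrained layer, and a convolution kernel would first have to be reshaped to a 2-D matrix (that layout is not specified in this summary).

```python
import numpy as np


def decompose(w0):
    """Split a pretrained weight matrix into fixed singular vectors
    (u0, vt0) and the singular values s0 that would be trained."""
    u0, s0, vt0 = np.linalg.svd(w0, full_matrices=False)
    return u0, s0, vt0


def adapted_weight(u0, sigma, vt0):
    """Rebuild W_Sigma = U0 diag(sigma) V0^T from frozen vectors and
    (possibly updated) singular values."""
    return (u0 * sigma) @ vt0  # scales column i of u0 by sigma[i]


rng = np.random.default_rng(0)
w0 = rng.standard_normal((8, 6))  # stand-in for a pretrained weight matrix

u0, s0, vt0 = decompose(w0)

# Only the singular values are free parameters: 6 values instead of 48 weights.
sigma = s0.copy()
sigma[0] *= 2.0  # hypothetical update: magnify the leading singular value

w_new = adapted_weight(u0, sigma, vt0)
print("full weights:", w0.size, "trainable singular values:", sigma.size)
```

With unchanged singular values, `adapted_weight(u0, s0, vt0)` reproduces `w0`, so adaptation starts exactly at the pretrained model. Changing one singular value alters the weights only along the corresponding pair of fixed singular vectors.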

Paper Structure

This paper contains 13 sections, 4 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: Few-shot image generation. Our method generates novel and high-quality samples in a new domain with a small amount of training data. (Top) Diverse random samples from adapting an FFHQ-pretrained StyleGAN2 to toddler images from the CelebA dataset (with only 30 images) using our method. (Bottom) Smooth latent space interpolation between two random seeds shows that our method produces novel samples instead of simply memorizing the 30 images. Please see the supplementary video for more results.
  • Figure 2: Comparing methods for GAN adaptation. Learnable parameters are denoted in red. (a) TransferGAN (TGAN for simplicity) [wang2018eccv] and FreezeD [mo2020freeze] retrain all weights $W$ in a layer. SSGAN [noguchi2019iccv] and FSGAN train significantly fewer parameters per layer. Note that FSGAN adapts both conv and FC layers, while SSGAN adapts only conv layers. #params is the number of learnable parameters per conv layer; Count gives parameter counts over the full StyleGAN2 generator and discriminator. (b) FSGAN (ours) adapts singular values $\Sigma=\{\sigma_1,\dots,\sigma_s\}$ of pretrained weights $W_0$ to obtain adapted weights $W_\Sigma$.
  • Figure 3: Effects of singular values. We visualize FSGAN's adaptation space by magnifying the top 3 singular values $\sigma_0, \sigma_1, \sigma_2$ from SVD performed on style and conv layers of a StyleGAN2 [karras2019style, karras2019analyzing] pretrained on FFHQ. In mapping layer 4 ($\text{style}_4$), the leading $\sigma$s change the age, skin tone, and head pose. In synthesis layer 2 ($\text{conv}_{8\times 8}$), face dimensions are modified in terms of face height, size, and width. In synthesis layer 9 ($\text{conv}_{1024 \times 1024}$), the face appearance changes in finer pixel statistics such as saturation, contrast, and color balance.
  • Figure 4: Problem with FID as a few-shot metric. TGAN [wang2018eccv] adaptation from English characters to 10-shot Kannada characters (bottom) [de2009character]. The adaptation process is illustrated by interpolating two random latent vectors at different timesteps (t=20 means 20K images seen during training). We measure FID against a 2K-image Kannada set, from which the 10 images were sampled. The interpolation shows that larger timesteps (t) tend to memorize the 10-image training set while yielding lower FID, revealing that FID favors overfitting and is not suitable for the few-shot setting.
  • Figure 5: Close-domain adaptation (FFHQ$\rightarrow$CelebA). Models adapted from a pretrained StyleGAN2 using $\sim$30 target images (left-most column) of (a) CelebA ID 4978 and (b) CelebA ID 3719. The proposed FSGAN generates more natural face images without noticeable artifacts. Comparison methods include TGAN [wang2018eccv], FD [mo2020freeze], and SSGAN [noguchi2019iccv], trained with a limited number of timesteps to prevent overfitting or degradation.
  • ...and 2 more figures
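
The FID discussion in the Figure 4 caption is easier to follow with the underlying distance written out. The sketch below computes the Fréchet distance between Gaussians fitted to two feature sets, which is the quantity FID reports. It is a minimal illustration, not the paper's evaluation code: the function name `frechet_distance` is hypothetical, and random arrays stand in for the Inception-network features that FID would normally use.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets
    (rows are samples, columns are feature dimensions)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b; clip tiny negatives from round-off.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_a - mu_b) ** 2)
    cov_term = np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt
    return float(mean_term + cov_term)


rng = np.random.default_rng(0)
reference = rng.standard_normal((2000, 4))  # stand-in for reference-set features
shifted = reference + 3.0                   # same spread, every mean moved by 3

d_same = frechet_distance(reference, reference)  # approximately 0
d_shift = frechet_distance(reference, shifted)   # approximately 4 * 3**2 = 36
print(d_same, d_shift)
```

The distance depends only on the means and covariances of the two feature sets. It therefore rewards matching the reference statistics and contains no term that checks whether generated samples are novel rather than copies of training images.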