WeditGAN: Few-Shot Image Generation via Latent Space Relocation

Yuxuan Duan; Li Niu; Yan Hong; Liqing Zhang

TL;DR

WeditGAN tackles few-shot image generation by transferring a pretrained StyleGAN through latent space relocation: a learned constant offset $\Delta w$ shifts the source latent space $W_{\text{src}}^+$ to form the target latent space $W_{\text{tgt}}^+$. Because the mapping and synthesis networks are frozen and only $\Delta w$ is learned, the approach preserves source-domain diversity while achieving target-domain fidelity. It further extends to the layer-wise $W^+$ space, with AlphaModules that adjust the editing intensity for each $w_{\text{src}}$, and adds optional perpendicular regularization and contrastive regularization to improve robustness. The method reports state-of-the-art performance across eight source/target pairs in 10-shot settings, offering competitive fidelity and high diversity without extensive parameter updates. The work demonstrates a simple, effective transfer mechanism that leverages the geometry of StyleGAN latent spaces. The code is open source, and the paper gives guidance on per-layer relocation, editing intensity, and orthogonality to regularization-based methods.
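
Written out, the relocation described above is a pure translation of the source latent space. The block below is a sketch based on this summary and the Figure 2 caption rather than a quotation of the paper's equations; the layer index $l$ and layer count $L$ are notation assumed here for the layer-wise $W^+$ case.

$$
w_{\text{tgt}}^{l} = w_{\text{src}}^{l} + \Delta w^{l}, \quad l = 1, \dots, L,
\qquad
W_{\text{tgt}}^+ = \{\, w + \Delta w : w \in W_{\text{src}}^+ \,\}
$$

Since $\Delta w$ does not depend on the sampled $w_{\text{src}}$, the map $w_{\text{src}} \mapsto w_{\text{tgt}}$ is a bijection between the two spaces.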

Abstract

In few-shot image generation, directly training GAN models on just a handful of images risks overfitting. A popular solution is to transfer models pretrained on large source domains to small target ones. In this work, we introduce WeditGAN, which realizes model transfer by editing the intermediate latent codes $w$ in StyleGANs with learned constant offsets ($\Delta w$), discovering and constructing target latent spaces by simply relocating the distribution of source latent spaces. The established one-to-one mapping between latent spaces naturally prevents mode collapse and overfitting. We also propose variants of WeditGAN that further enhance the relocation process by regularizing the direction or finetuning the intensity of $\Delta w$. Experiments on a collection of widely used source/target datasets demonstrate that WeditGAN generates realistic and diverse images and is simple yet highly effective for few-shot image generation. Code is available at https://github.com/Ldhlwh/WeditGAN.
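
The one-to-one mapping claim can be illustrated with a small, dependency-free sketch. It is not taken from the WeditGAN repository: `sample_w_src` is a hypothetical stand-in for the frozen mapping network, `NUM_LAYERS` and `W_DIM` are made-up sizes, and `delta_w` holds arbitrary numbers where the real method would use learned values. The point is only that adding a constant offset keeps distinct source codes distinct and leaves the differences between them unchanged.

```python
import random

NUM_LAYERS = 4  # hypothetical number of synthesis layers
W_DIM = 8       # hypothetical latent width


def sample_w_src(rng):
    """Stand-in for the frozen mapping network: one w, copied to every layer."""
    w = [rng.gauss(0.0, 1.0) for _ in range(W_DIM)]
    return [list(w) for _ in range(NUM_LAYERS)]


def relocate(w_src, delta_w):
    """Layer-wise relocation: w_tgt = w_src + delta_w."""
    return [[ws + dw for ws, dw in zip(layer_w, layer_dw)]
            for layer_w, layer_dw in zip(w_src, delta_w)]


def diff(u, v):
    """Element-wise difference between two layer-wise codes."""
    return [[x - y for x, y in zip(lu, lv)] for lu, lv in zip(u, v)]


rng = random.Random(0)
# Arbitrary per-layer offsets; in the method these are the only trained parameters.
delta_w = [[0.1 * (layer + 1)] * W_DIM for layer in range(NUM_LAYERS)]

w_a, w_b = sample_w_src(rng), sample_w_src(rng)
t_a, t_b = relocate(w_a, delta_w), relocate(w_b, delta_w)

# The offset cancels, so target-space differences match source-space differences.
max_gap = max(abs(x - y)
              for ra, rb in zip(diff(t_a, t_b), diff(w_a, w_b))
              for x, y in zip(ra, rb))
print("largest change in pairwise difference:", max_gap)
```

In an actual transfer, `t_a` and `t_b` would be fed to the frozen synthesis network; this sketch stops at the latent codes.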

Paper Structure (42 sections, 11 equations, 10 figures, 7 tables)

Introduction
Related Work
Latent Space Manipulation
Few-shot Image Generation
Few-shot Domain Adaptation
Method
StyleGAN Preliminary
WeditGAN
Extended Latent Space
Latent Space Relocation
Objective
WeditGAN Variants
Perpendicular Regularization
Editing Intensity Finetuning
Orthogonality to Regularization-based Methods
...and 27 more sections

Figures (10)

Figure 1: The core idea of latent space relocation with constant latent codes $\Delta w$, based on the fact that the latent spaces of related domains in the same generative model share similar shapes of manifolds.
Figure 2: The procedure of WeditGAN. Left: A StyleGAN is first trained on a large source dataset. Middle: During the transfer process, the mapping and the synthesis network are both fixed. With the target latent code $w_\mathrm{tgt}$ constructed by summing up $w_\mathrm{src}$ and the only trainable parameters $\Delta w$, the synthesis network can generate images of the target domain. Right: After $\Delta w$ is learned and fixed, WeditGAN trains a set of AlphaModules to finetune the editing intensity customized for each $w_\mathrm{src}$ (optional).
Figure 3: The 10-shot datasets (left), the generated samples of source domain FFHQ (top), target domain Sketches (middle) and Babies (bottom). Generated samples in each column are generated with the same random input $z$.
Figure 4: Visual comparisons between WeditGAN and its three variants. Left: WeditGAN perp on Sunglasses. Middle: WeditGAN alpha on Babies. $\alpha_4, \alpha_8, \dots, \alpha_{256}$ are the intensity residuals corresponding to the synthesis blocks at resolution $4, 8, \dots, 256$. Right: WeditGAN CL on Sketches.
Figure 5: Samples generated by WeditGAN using $\Delta w$ learned on Sketches, Babies or Sunglasses with different intensities ranging from $-0.25$ to $1.0$.
...and 5 more figures
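
The intensity sweep in Figure 5 and the per-block intensity residuals in Figure 4 can be sketched as scalings of the same offset. This is a sketch under stated assumptions, not the repository's code: the function names are hypothetical, the numbers are toy values, and the `(1 + alpha)` form is inferred from the captions' use of the word "residuals". In the paper's setup the residuals come from trained AlphaModules that depend on $w_\mathrm{src}$; here they are fixed constants.

```python
def edit_with_intensity(w_src, delta_w, intensity):
    """Return w_src + intensity * delta_w for a single latent vector."""
    return [ws + intensity * dw for ws, dw in zip(w_src, delta_w)]


def edit_with_residuals(w_src_plus, delta_w_plus, alphas):
    """Per-layer edit w_src + (1 + alpha) * delta_w (assumed form).

    alphas holds one residual per layer; a residual of 0 gives plain relocation.
    """
    return [edit_with_intensity(w, dw, 1.0 + a)
            for w, dw, a in zip(w_src_plus, delta_w_plus, alphas)]


# Toy two-dimensional values chosen so the arithmetic is exact.
w_src = [0.5, -1.0]
delta_w = [2.0, 4.0]

# Intensities inside the range shown in Figure 5 (-0.25 to 1.0).
sweep = {s: edit_with_intensity(w_src, delta_w, s) for s in (-0.25, 0.0, 0.5, 1.0)}
for s, w in sweep.items():
    print(s, w)

# Two layers sharing the same toy code, with different fixed residuals.
per_layer = edit_with_residuals([w_src, w_src], [delta_w, delta_w], [0.0, -0.5])
print(per_layer)
```

An intensity of 0 returns the source code unchanged and an intensity of 1 gives the plain relocation, so values in between move partway from the source domain toward the target domain.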
