Improving Inversion and Generation Diversity in StyleGAN using a Gaussianized Latent Space

Jonas Wulff; Antonio Torralba

TL;DR

The paper addresses the instability of latent vectors obtained by inverting images into StyleGAN's latent spaces, where small perturbations and interpolation can produce strong image distortions. Applying a simple Leaky ReLU transformation to the latent vectors yields an approximately Gaussian distribution, so the data distribution can be described analytically by a mean and a covariance. The authors use the resulting Gaussian prior to regularize inversion, which improves inversion quality and the smoothness of interpolations for both $\mathcal{W}$ and $\mathcal{W}^{+}$. They further use the Gaussian model to analyze generator artifacts via PCA and to reduce them with a logarithmic compression of the dominant components, an alternative to truncation that preserves diversity. Overall, the work provides a principled framework for stable inversion and artifact mitigation in high-fidelity GANs, with practical relevance for image editing and dataset generation.
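The Gaussian prior summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names are hypothetical, the Leaky ReLU slope of 5.0 is an assumption (chosen here so that it undoes a Leaky ReLU with slope 0.2), the latents are random stand-ins rather than outputs of a real mapping network, and the squared Mahalanobis distance is used as one natural form of a Gaussian penalty.

```python
import numpy as np

def gaussianize(w, slope=5.0):
    """Leaky ReLU with negative-side slope `slope` (the value 5.0 is an assumption)."""
    return np.where(w >= 0, w, slope * w)

def fit_gaussian(v):
    """Mean and covariance of row-wise samples v with shape (n, d)."""
    mu = v.mean(axis=0)
    cov = np.cov(v, rowvar=False)
    return mu, cov

def prior_penalty(v, mu, cov, eps=1e-6):
    """Squared Mahalanobis distance of a single latent v from the fitted Gaussian."""
    d = v - mu
    reg = cov + eps * np.eye(cov.shape[0])  # small ridge for numerical stability
    return float(d @ np.linalg.solve(reg, d))

# Stand-in latents (hypothetical data): Gaussian noise passed through a
# Leaky ReLU with slope 0.2, mimicking the last layer of a mapping network.
rng = np.random.default_rng(0)
w_samples = rng.normal(size=(2000, 8))
w_samples = np.where(w_samples >= 0, w_samples, 0.2 * w_samples)

v_samples = gaussianize(w_samples)
mu, cov = fit_gaussian(v_samples)

penalty_at_mean = prior_penalty(mu, mu, cov)    # zero by construction
penalty_far = prior_penalty(mu + 10.0, mu, cov)  # grows away from the mean
```

During inversion, a penalty of this kind would be added to the reconstruction loss so that the optimized latent stays in a high-density region of the fitted distribution. The relative weighting of the two terms is not specified here.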

Abstract

Modern Generative Adversarial Networks are capable of creating artificial, photorealistic images from latent vectors living in a low-dimensional learned latent space. It has been shown that a wide range of images can be projected into this space, including images outside of the domain that the generator was trained on. However, while in this case the generator reproduces the pixels and textures of the images, the reconstructed latent vectors are unstable, and small perturbations result in significant image distortions. In this work, we propose to explicitly model the data distribution in latent space. We show that, under a simple nonlinear operation, the data distribution can be modeled as Gaussian and therefore expressed using sufficient statistics. This yields a simple Gaussian prior, which we use to regularize the projection of images into the latent space. The resulting projections lie in smoother and better-behaved regions of the latent space, as shown using interpolation performance for both real and generated images. Furthermore, the Gaussian model of the distribution in latent space allows us to investigate the origins of artifacts in the generator output, and provides a method for reducing these artifacts while maintaining diversity of the generated images.
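The artifact-reduction idea, described in the summary as PCA followed by a logarithmic compression of the dominant components, could look roughly like the sketch below. The exact compression function is not given in this text, so the curve `sign(c) * log(1 + |c|)` applied to whitened PCA coordinates is an assumption, as are the function name, the parameter `k`, and the synthetic data.

```python
import numpy as np

def log_compress_dominant(v, mu, cov, k=2):
    """Shrink coordinates along the k largest-variance principal directions.

    v may be a single latent of shape (d,) or a batch of shape (n, d).
    """
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    std = np.sqrt(np.maximum(eigvals, 1e-12))

    coords = (v - mu) @ eigvecs / std       # PCA coordinates in units of std
    top = coords[..., :k]
    # Assumed compression curve: close to identity near 0, logarithmic far out.
    coords[..., :k] = np.sign(top) * np.log1p(np.abs(top))
    return mu + (coords * std) @ eigvecs.T  # map back to the latent space

# Synthetic stand-in for Gaussianized latents with two dominant directions.
rng = np.random.default_rng(0)
scales = np.array([5.0, 3.0, 1.0, 0.5])
v_samples = rng.normal(size=(5000, 4)) * scales
mu = v_samples.mean(axis=0)
cov = np.cov(v_samples, rowvar=False)

compressed = log_compress_dominant(v_samples, mu, cov, k=2)
```

Unlike truncation, which pulls every coordinate toward the mean, a compression of this kind only touches the selected dominant directions and leaves the remaining components unchanged, which is one way to read the claim that diversity is preserved.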
