Controlling the Output of a Generative Model by Latent Feature Vector Shifting

Róbert Belanec; Peter Lacko; Kristína Malinovská

TL;DR

This work uses a pre-trained model of StyleGAN3 that generates images of realistic human faces in relatively high resolution and combines the model with a convolutional neural network classifier trained to classify the generated images with binary facial features from the CelebA dataset.

Abstract

State-of-the-art generative models (e.g. StyleGAN3 \cite{karras2021alias}) often generate photorealistic images from vectors sampled from their latent space. However, the ability to control the output is limited. Here we present a novel method of latent vector shifting for controlled modification of the output image, utilizing semantic features of the generated images. In our approach, we use a pre-trained StyleGAN3 model that generates images of realistic human faces in relatively high resolution. We complement the generative model with a convolutional neural network classifier, namely ResNet34, trained to classify the generated images with binary facial features from the CelebA dataset. Our latent feature shifter is a neural network model whose task is to shift the latent vectors of a generative model in a specified feature direction. To train the latent feature shifter, we designed a dataset of pairs of latent vectors with and without a certain feature. We trained the latent feature shifter for multiple facial features and outperformed our baseline method in the number of generated images with the desired feature. Based on the evaluation, we conclude that our latent feature shifter approach was successful in controlling the output of the StyleGAN3 generator.
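The core idea in the abstract, learning a mapping from a latent vector to its shifted counterpart from paired examples, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the pairs are synthetic (each target is the input moved along one random direction), the latent size is a toy value, and a linear map trained by gradient descent on the MSE loss stands in for the neural network shifter. The names `make_pairs`, `train_linear_shifter`, and `shift` are hypothetical and do not come from the paper.

```python
import numpy as np

LATENT_DIM = 8  # toy size for illustration only; real StyleGAN3 latents are larger


def make_pairs(n, direction, rng):
    """Synthetic (z, z_shifted) pairs: each target is z moved along `direction`.

    Assumption: in the paper the pairs come from generated and classified
    images; here they are fabricated so the example is self-contained.
    """
    z = rng.standard_normal((n, LATENT_DIM))
    return z, z + direction


def train_linear_shifter(z, z_target, lr=0.1, epochs=500):
    """Fit z_target ~= z @ W + b by full-batch gradient descent on the MSE loss."""
    n, d = z.shape
    W = np.eye(d)      # start from the identity map (no shift)
    b = np.zeros(d)
    for _ in range(epochs):
        err = z @ W + b - z_target       # residuals, shape (n, d)
        W -= lr * (z.T @ err) / n        # gradient of 0.5 * mean squared error w.r.t. W
        b -= lr * err.mean(axis=0)       # gradient w.r.t. b
    return W, b


def shift(z, W, b):
    """Apply the trained shifter to a batch of latent vectors."""
    return z @ W + b


rng = np.random.default_rng(0)
direction = rng.standard_normal(LATENT_DIM)   # hypothetical feature direction
z_train, z_target = make_pairs(512, direction, rng)
W, b = train_linear_shifter(z_train, z_target)

# Evaluate on unseen latent vectors against the known synthetic shift.
z_new = rng.standard_normal((4, LATENT_DIM))
mse = float(np.mean((shift(z_new, W, b) - (z_new + direction)) ** 2))
```

In the setting the abstract describes, the shifted vectors would be fed to the generator and the resulting images checked by the classifier; that step is omitted here because it requires the pre-trained models.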

Paper Structure (15 sections, 8 figures, 6 tables)

Introduction
Related work
Generative adversarial networks
Image classification
Semantic feature control
Shifted latent vectors dataset
Generating unconditioned facial images using StyleGAN3
Classifying generated images
Feature axis regression
Dataset composition
Our latent feature shifting approach
Latent feature shifter design and training
Preliminary evaluation
Evaluation using ResNet34 classifier
Conclusion

Figures (8)

Figure 1: Example of generating four sample pairs for the shifted latent vectors dataset from the original image (with green border).
Figure 2: Diagram representing the process of generating a latent vector that will be shifted by each model trained on a different feature dataset, which should result in a latent vector representing all of the required features. The plus sign represents a vector concatenation operation.
Figure 3: Average validation MSE loss development over ten training epochs for five different shifting latent vector model architectures (a-e).
Figure 4: Results of adding the eyeglasses feature to eleven random vectors. Each row represents a different approach to adding the feature.
Figure 5: Results of adding the male feature to eleven random vectors. Each row represents a different approach to adding the feature.
...and 3 more figures
