StyleAutoEncoder for manipulating image attributes using pre-trained StyleGAN

Andrzej Bedychaj; Jacek Tabor; Marek Śmieja

TL;DR

StyleAE introduces a lightweight AutoEncoder plugin that attaches to a pre-trained StyleGAN to enable targeted attribute manipulation directly in the StyleGAN latent space $W$, by learning a structured mapping to a latent $(C,S)$ where $C=(C_1,...,C_K)$ encodes labeled attributes and $S$ captures remaining information. The encoder $\mathcal{E}$ extracts attributes from $w$, while the decoder $\mathcal{D}$ reconstructs $w$ to preserve image quality, trained with a dual loss that penalizes reconstruction error and deviations from target attributes, including a specialized $d_A^S$ term for binary attributes. Compared to flow-based approaches like StyleFlow and PluGeN, StyleAE is simpler and significantly more computationally efficient, yet achieves comparable attribute manipulation accuracy and superior preservation of other image features on FFHQ and AFHQv2 datasets. The results demonstrate that a straightforward AutoEncoder framework can yield effective, controllable attribute edits with faster training and inference, broadening practical deployment of attribute-conditioned image editing across diverse domains. Future work includes improving latent-space disentanglement and extending StyleAE to other generative backbones beyond StyleGAN.
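The encode, edit, decode procedure and the dual loss summarized above can be illustrated with a minimal pure-Python sketch. Everything named here is an assumption made for illustration: `toy_encoder` and `toy_decoder` are hypothetical stand-ins for $\mathcal{E}$ and $\mathcal{D}$ (the real ones are trained neural networks), the weight `lam` is hypothetical, and a plain squared error replaces the paper's specialized $d_A^S$ term for binary attributes, whose exact form is not given in this summary.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Encoder = Callable[[Sequence[float]], Tuple[Vector, Vector]]
Decoder = Callable[[Sequence[float], Sequence[float]], Vector]

K = 2  # number of labelled attributes (toy value, an assumption)


def toy_encoder(w: Sequence[float]) -> Tuple[Vector, Vector]:
    """Hypothetical stand-in for E: the first K coordinates act as C, the rest as S."""
    return list(w[:K]), list(w[K:])


def toy_decoder(c: Sequence[float], s: Sequence[float]) -> Vector:
    """Hypothetical stand-in for D: concatenate (C, S) back into a style code."""
    return list(c) + list(s)


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def dual_loss(w: Sequence[float], c_target: Sequence[float],
              encoder: Encoder, decoder: Decoder, lam: float = 1.0) -> float:
    """Reconstruction error on w plus a penalty on deviation from target attributes.

    The attribute penalty is a plain squared error here (an assumption),
    not the paper's d_A^S term.
    """
    c, s = encoder(w)
    w_rec = decoder(c, s)
    return mse(w, w_rec) + lam * mse(c, c_target)


def edit_attribute(w: Sequence[float], k: int, value: float,
                   encoder: Encoder, decoder: Decoder) -> Vector:
    """Encode w, overwrite attribute coordinate C_k, and decode back to a style code."""
    c, s = encoder(w)
    c = list(c)   # copy so the encoder's output is not mutated
    c[k] = value  # set the requested attribute; S is left untouched
    return decoder(c, s)


w = [0.2, -1.0, 0.5, 0.7]  # toy 4-dimensional "style code"
w_edit = edit_attribute(w, k=1, value=1.0, encoder=toy_encoder, decoder=toy_decoder)
loss = dual_loss(w, c_target=[0.2, 1.0], encoder=toy_encoder, decoder=toy_decoder)
```

In this toy setup only the coordinate for the edited attribute changes in `w_edit`, which mirrors the idea that $S$ carries the remaining image information. With the actual method, the edited style code would then be passed to the pre-trained StyleGAN synthesis network to render the image.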

Abstract

Deep conditional generative models are excellent tools for creating high-quality images and editing their attributes. However, training modern generative models from scratch is very expensive and requires large computational resources. In this paper, we introduce StyleAutoEncoder (StyleAE), a lightweight AutoEncoder module, which works as a plugin for pre-trained generative models and allows for manipulating the requested attributes of images. The proposed method offers a cost-effective solution for training deep generative models with limited computational resources, making it a promising technique for a wide range of applications. We evaluate StyleAutoEncoder by combining it with StyleGAN, which is currently one of the top generative models. Our experiments demonstrate that StyleAutoEncoder is at least as effective in manipulating image attributes as the state-of-the-art algorithms based on invertible normalizing flows. However, it is simpler, faster, and gives more freedom in designing neural

Paper Structure (22 sections, 4 equations, 4 figures, 5 tables)

Introduction
Related Work
Methodology
Preliminaries
StyleGAN [karras2019stylebased, karras2020analyzing, karras2021aliasfree]:
AutoEncoder:
StyleAutoEncoder
Structure of the target space:
Loss function:
Discussion
Image Editing:
Related models:
Evaluation metrics
Models implementation
Manipulation of facial features
...and 7 more sections

Figures (4)

Figure 1: Single-attribute manipulation with StyleAE in StyleGAN's latent space.
Figure 2: Architecture design of StyleAE. StyleAE maps the style code $w$ of the pre-trained StyleGAN into a target space, where labelled attributes are modelled by individual coordinates.
Figure 3: Examples of attribute modification for all tested models on the FFHQ dataset. StyleAE correctly modifies the requested attributes and is less invasive to the remaining characteristics of the image than the competing flow-based methods.
Figure 4: Attribute modification on a sample image generated by StyleGAN. The images produced by all models show successful changes in the manipulated attributes while maintaining the overall coherence of the image. These results indicate that StyleAE performs comparably to state-of-the-art flow-based models in attribute manipulation.
