StyleAutoEncoder for manipulating image attributes using pre-trained StyleGAN

Andrzej Bedychaj, Jacek Tabor, Marek Śmieja

TL;DR

StyleAE is a lightweight AutoEncoder plugin that attaches to a pre-trained StyleGAN and enables targeted attribute manipulation directly in the StyleGAN latent space $W$. It learns a structured mapping from a style code $w \in W$ to a latent code $(C,S)$, where $C=(C_1,\ldots,C_K)$ encodes the $K$ labeled attributes and $S$ captures the remaining information. The encoder $\mathcal{E}$ extracts attributes from $w$, while the decoder $\mathcal{D}$ reconstructs $w$ to preserve image quality. Both are trained with a two-term loss that penalizes reconstruction error and deviations from target attributes, including a specialized $d_A^S$ term for binary attributes. Compared to flow-based approaches such as StyleFlow and PluGeN, StyleAE is simpler and significantly more computationally efficient, yet it achieves comparable attribute manipulation accuracy and better preservation of other image features on the FFHQ and AFHQv2 datasets. The results show that a straightforward AutoEncoder framework can yield effective, controllable attribute edits with faster training and inference, which makes attribute-conditioned image editing more practical to deploy across diverse domains. Future work includes improving latent-space disentanglement and extending StyleAE to generative backbones other than StyleGAN.
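To make the $(C,S)$ split and the two-term loss concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: `ToyStyleAE`, `dual_loss`, `edit_attribute`, and the weight `lam` are hypothetical names, the encoder and decoder are single linear maps, the latent code has the same dimension as $w$, and a plain squared error stands in for the attribute term (the paper's $d_A^S$ term for binary attributes is not reproduced here).

```python
import random


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ToyStyleAE:
    """Hypothetical linear stand-in for the encoder E and decoder D."""

    def __init__(self, w_dim, k_attrs, seed=0):
        rng = random.Random(seed)
        self.k = k_attrs
        # Assumption: the latent (C, S) has the same total size as w.
        self.enc = [[rng.gauss(0, 0.1) for _ in range(w_dim)] for _ in range(w_dim)]
        self.dec = [[rng.gauss(0, 0.1) for _ in range(w_dim)] for _ in range(w_dim)]

    def encode(self, w):
        z = matvec(self.enc, w)
        return z[:self.k], z[self.k:]  # C: attribute coordinates, S: the rest

    def decode(self, c, s):
        return matvec(self.dec, list(c) + list(s))


def dual_loss(model, w, labels, lam=1.0):
    """Reconstruction error plus a weighted attribute error (squared-error stand-in)."""
    c, s = model.encode(w)
    reconstruction = sq_dist(model.decode(c, s), w)
    attribute = sq_dist(c, labels)
    return reconstruction + lam * attribute


def edit_attribute(model, w, index, value):
    """Overwrite one attribute coordinate in C, keep S, and decode a new style code."""
    c, s = model.encode(w)
    c = list(c)
    c[index] = value
    return model.decode(c, s)
```

In this sketch an edit amounts to encoding $w$, overwriting one coordinate of $C$, and decoding; the edited style code would then be passed to the frozen StyleGAN generator, which is outside the scope of the example.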

Abstract

Deep conditional generative models are excellent tools for creating high-quality images and editing their attributes. However, training modern generative models from scratch is very expensive and requires large computational resources. In this paper, we introduce StyleAutoEncoder (StyleAE), a lightweight AutoEncoder module, which works as a plugin for pre-trained generative models and allows for manipulating the requested attributes of images. The proposed method offers a cost-effective solution for training deep generative models with limited computational resources, making it a promising technique for a wide range of applications. We evaluate StyleAutoEncoder by combining it with StyleGAN, which is currently one of the top generative models. Our experiments demonstrate that StyleAutoEncoder is at least as effective in manipulating image attributes as the state-of-the-art algorithms based on invertible normalizing flows. However, it is simpler, faster, and gives more freedom in designing neural

Paper Structure (22 sections, 4 equations, 4 figures, 5 tables)


Figures (4)

  • Figure 1: Single-attribute manipulation with StyleAE in StyleGAN's latent space.
  • Figure 2: Architecture design of StyleAE. StyleAE maps the style code $w$ of the pre-trained StyleGAN into a target space, where labelled attributes are modelled by individual coordinates.
  • Figure 3: Examples of attribute modification for all tested models on the FFHQ dataset. StyleAE correctly modifies the requested attributes and is less invasive to the remaining characteristics of the image than the competing flow-based methods.
  • Figure 4: Attribute modification on a sample image generated by StyleGAN. The images generated by all models show successful changes in the manipulated attributes while maintaining the overall coherence of the image. These findings indicate that StyleAE performs comparably to state-of-the-art flow-based models in attribute manipulation.