Visualize and Paint GAN Activations

Rudolf Herdt, Peter Maass

TL;DR

The paper studies the internal representations of GANs by visualizing activation vectors $v$ taken from hidden layers, and shows that painting with those vectors controls the generated output, so images can be generated from a segmentation map without labeled training data. It distinguishes tileable from non-tileable activations and presents two visualization schemes, Full Replace and Grid Replace, to accommodate the different structures, plus a painting workflow that maps colors to activation vectors for structured edits. Across StyleGAN2 models (AFHQ Wild, BreCaHAD, LSUN Church) and a digipath GAN, tileable vectors yield coherent textures while non-tileable vectors require grid-based painting, offering a practical route to annotated data for semantic segmentation. The findings advance the interpretability of GANs, provide concrete tools for targeted image editing and data augmentation in medical imaging, and may extend to other generative models such as VAEs, with the caveat that conditioning inputs influence activation-vector semantics.

Abstract

We investigate how the structures generated by GANs correlate with the activations in their hidden layers, with the purpose of better understanding the inner workings of those models and being able to paint structures with unconditionally trained GANs. This gives us more control over the generated images, allowing us to generate them from a semantic segmentation map without requiring such segmentations in the training data. To this end we introduce the concept of tileable features, which allows us to identify activations that work well for painting.
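
To make the painting idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that each color of a segmentation-style map has already been assigned one activation vector, and it builds a (channels, height, width) tensor by writing that vector at every position of the matching color. The names `paint_activations`, `full_replace`, `palette`, and `color_map` are hypothetical, as are the toy vector values. Reading Full Replace as "replicate one vector at every spatial position" is also an assumption based on the figure captions. In an actual pipeline the resulting tensor would stand in for the activations of a hidden layer, and the remaining generator layers would render the image.

```python
from typing import Dict, List, Tuple

Color = Tuple[int, int, int]
Vector = List[float]
Tensor = List[List[List[float]]]  # indexed as [channel][row][column]


def paint_activations(color_map: List[List[Color]],
                      palette: Dict[Color, Vector]) -> Tensor:
    """Build an activation tensor from a color-coded map.

    Each spatial position receives the activation vector that the
    (hypothetical) palette assigns to the color at that position.
    """
    height, width = len(color_map), len(color_map[0])
    channels = len(next(iter(palette.values())))
    tensor = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for y, row in enumerate(color_map):
        for x, color in enumerate(row):
            vector = palette[color]
            for c in range(channels):
                tensor[c][y][x] = vector[c]
    return tensor


def full_replace(vector: Vector, height: int, width: int) -> Tensor:
    """Replicate a single activation vector at every spatial position."""
    return [[[value] * width for _ in range(height)] for value in vector]


# Toy example: two colors, three channels, a 2x3 map.
RED, BLUE = (255, 0, 0), (0, 0, 255)
palette = {RED: [1.0, 0.0, 0.5], BLUE: [0.0, 2.0, -1.0]}
color_map = [[RED, RED, BLUE],
             [RED, BLUE, BLUE]]
painted = paint_activations(color_map, palette)
```

Painting a map that contains a single color is equivalent to calling `full_replace` with that color's vector, which is one way to see the two operations as the same mechanism at different granularities.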

Paper Structure (21 sections, 9 figures, 1 table, 1 algorithm)

Figures (9)

  • Figure 1: Overview of our visualization method
  • Figure 2: Example grid masks. From left to right: a grid size of 1; a grid size of 2; full replication of the activation vector
  • Figure 3: Example painting
  • Figure 4: Using perceptual loss from a feature extractor to help guide the training of the digipath GAN
  • Figure 5: 4 visualizations with the highest cosine similarity, i.e. tileable features
  • ...and 4 more figures