Stacked Generative Adversarial Networks

Xun Huang; Yixuan Li; Omid Poursaeed; John Hopcroft; Serge Belongie

TL;DR

SGAN introduces a top-down stack of GANs that inverts a pre-trained discriminative encoder by enforcing adversarial alignment of intermediate representations through representation discriminators. It adds a conditional loss to preserve higher-level conditioning and an entropy loss to promote diverse outputs via a variational lower bound on $H(\hat{h}_i \mid h_{i+1})$. Training proceeds from independent per-stack objectives to end-to-end joint optimization, enabling hierarchical decomposition of variation and conditioning on class labels. On MNIST, SVHN, and CIFAR-10, SGAN achieves higher image quality and diversity than vanilla GAN variants, with state-of-the-art Inception scores on CIFAR-10 and strong human-perceived realism in Visual Turing Tests. The work demonstrates that leveraging hierarchical discriminative representations can substantially improve generative modeling while enhancing interpretability through multi-level latent structure.

Abstract

In this paper, we propose a novel generative model named Stacked Generative Adversarial Networks (SGAN), which is trained to invert the hierarchical representations of a bottom-up discriminative network. Our model consists of a top-down stack of GANs, each learned to generate lower-level representations conditioned on higher-level representations. A representation discriminator is introduced at each feature hierarchy to encourage the representation manifold of the generator to align with that of the bottom-up discriminative network, leveraging the powerful discriminative representations to guide the generative model. In addition, we introduce a conditional loss that encourages the use of conditional information from the layer above, and a novel entropy loss that maximizes a variational lower bound on the conditional entropy of generator outputs. We first train each stack independently, and then train the whole model end-to-end. Unlike the original GAN that uses a single noise vector to represent all the variations, our SGAN decomposes variations into multiple levels and gradually resolves uncertainties in the top-down generative process. Based on visual inspection, Inception scores and visual Turing test, we demonstrate that SGAN is able to generate images of much higher quality than GANs without stacking.
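
The top-down generative process described above can be illustrated with a small sketch. It is not the authors' implementation: the generators here are untrained random linear maps, and the names `make_toy_generator` and `sample_top_down`, the layer widths, and the noise sizes are all hypothetical choices made for illustration. The only point it tries to show is the data flow: each level receives the representation from the level above plus its own fresh noise vector, so variation is injected at several levels rather than through a single noise vector.

```python
import numpy as np


def make_toy_generator(rng, cond_dim, noise_dim, out_dim):
    """Hypothetical stand-in for one generator G_i(h_{i+1}, z_i) -> h_i.

    A single random linear layer followed by tanh; a real SGAN generator
    would be a trained neural network.
    """
    w_h = rng.standard_normal((cond_dim, out_dim)) / np.sqrt(cond_dim)
    w_z = rng.standard_normal((noise_dim, out_dim)) / np.sqrt(noise_dim)

    def generator(h_above, z):
        # Combine the higher-level representation with level-specific noise.
        return np.tanh(h_above @ w_h + z @ w_z)

    return generator


def sample_top_down(generators, noise_dims, y_onehot, rng):
    """Run the stack from the top (class label) down to the lowest level."""
    h = y_onehot
    for generator, noise_dim in zip(generators, noise_dims):
        # Each level draws its own noise vector z_i.
        z = rng.standard_normal((h.shape[0], noise_dim))
        h = generator(h, z)
    return h


rng = np.random.default_rng(0)

# Assumed sizes: 10 classes -> 64 -> 256 -> 784 (e.g. a flattened 28x28 image).
level_dims = [10, 64, 256, 784]
noise_dims = [16, 16, 16]
generators = [
    make_toy_generator(rng, level_dims[i], noise_dims[i], level_dims[i + 1])
    for i in range(len(noise_dims))
]

labels = np.array([3, 3, 7])
y_onehot = np.eye(10)[labels]
images = sample_top_down(generators, noise_dims, y_onehot, rng)
print(images.shape)
```

In this sketch the first two samples share the same class label but still differ, because each level draws independent noise. In the actual model, the conditional and entropy losses are what encourage a trained generator to both respect the conditioning and make use of that noise.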
