Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks

Chuan Li, Michael Wand

TL;DR

The paper tackles the computational bottleneck of deep Markovian texture synthesis by introducing MGANs, a feed-forward generator that inverts fixed VGG feature maps via a precomputed strided convolutional decoder. Trained adversarially with a patch-based discriminator on Markovian activations, MGANs achieve real-time texture synthesis with high fidelity, enabling texture generation, style transfer, and video stylization without iterative optimization at run time. The approach drastically speeds up generation compared with prior MDANs while maintaining competitive quality, and it highlights the importance of patch-level statistics and stabilized hinge-based adversarial training. Limitations include dependence on pre-trained features (VGG19) and challenges with non-texture content, signaling opportunities to combine with semantic models and multi-scale structures for broader applicability.

Abstract

This paper proposes Markovian Generative Adversarial Networks (MGANs), a method for training generative neural networks for efficient texture synthesis. While deep neural network approaches have recently demonstrated remarkable results in terms of synthesis quality, they still come at considerable computational cost (minutes of run-time for low-res images). Our paper addresses this efficiency issue. Instead of the numerical deconvolution used in previous work, we precompute a feed-forward, strided convolutional network that captures the feature statistics of Markovian patches and can directly generate outputs of arbitrary dimensions. Such a network can directly decode brown noise into realistic texture, or photos into artistic paintings. With adversarial training, we obtain quality comparable to recent neural texture synthesis methods. As no optimization is required any longer at generation time, our run-time performance (0.25M pixel images at 25Hz) surpasses previous neural texture synthesizers by a significant margin (at least 500 times faster). We apply this idea to texture synthesis, style transfer, and video stylization.

Paper Structure

This paper contains 10 sections, 2 equations, 15 figures.

Figures (15)

  • Figure 1: Motivation: real-world data does not always follow a Gaussian distribution (first), but may instead lie on a complex nonlinear manifold (second). We adversarially learn a mapping to project contextually related patches to that manifold.
  • Figure 2: Our model contains a generative network (blue blocks) and a discriminative network (green blocks). We apply discriminative training to Markovian neural patches (purple block, the input of the discriminative network).
  • Figure 3: Un-guided texture synthesis using MDANs. For each case the first image is the example texture, and the other two are the synthesis results. Image credits: Xie16's "Ivy", flickr user erwin brevis's "güell", Katsushika Hokusai's "The Great Wave off Kanagawa", Kandinsky's "Composition VII".
  • Figure 4: Guided texture synthesis using MDANs. The reference textures are the same as in Figure 3.
  • Figure 5: Our MGANs learn a mapping from the VGG_19 encoding of the input photo to the stylized example (MDANs). The reference style texture for MDANs is Pablo Picasso's "self portrait 1907". We compare the results of MGANs to Pixel VAE and Neural VAE on both training and testing data.
  • ...and 10 more figures