Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks
Chuan Li, Michael Wand
TL;DR
The paper tackles the computational bottleneck of deep Markovian texture synthesis by introducing MGANs, a feed-forward generator that inverts fixed VGG feature maps via a precomputed strided convolutional decoder. Trained adversarially with a patch-based discriminator on Markovian activations, MGANs achieve real-time texture synthesis with high fidelity, enabling texture generation, style transfer, and video stylization without iterative optimization at run time. The approach drastically speeds up generation compared with prior MDANs while maintaining competitive quality, and it highlights the importance of patch-level statistics and stabilized hinge-based adversarial training. Limitations include dependence on pre-trained features (VGG19) and challenges with non-texture content, signaling opportunities to combine with semantic models and multi-scale structures for broader applicability.
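To make the hinge-based, patch-level objective mentioned above concrete, here is a minimal sketch in plain Python. It assumes the discriminator emits one real-valued score per patch and that labels are +1 for real patches and -1 for generated ones. The function name `patch_hinge_loss` and the example scores are hypothetical and are not taken from the paper, and the authors' exact objective may differ.

```python
def patch_hinge_loss(scores, target):
    """Mean hinge loss over per-patch discriminator scores.

    scores: iterable of floats, one score per patch (assumed format).
    target: +1 if the patches should be classified as real, -1 if fake.
    """
    scores = list(scores)
    if target not in (1, -1):
        raise ValueError("target must be +1 or -1")
    if not scores:
        raise ValueError("at least one patch score is required")
    # A patch contributes zero loss once its score clears the margin of 1
    # on the correct side; otherwise the penalty grows linearly.
    return sum(max(0.0, 1.0 - target * s) for s in scores) / len(scores)


# Hypothetical scores for three patches of a synthesized image.
confident = [1.5, 2.0, 1.2]   # all patches look real to the discriminator
mixed = [1.5, 0.0, -1.0]      # one patch is ambiguous, one looks fake

loss_confident = patch_hinge_loss(confident, target=1)
loss_mixed = patch_hinge_loss(mixed, target=1)
```

Because the loss is averaged over patches rather than computed once per image, a single implausible patch raises the loss even when the rest of the image is convincing, which is one way to read the emphasis on patch-level statistics.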
Abstract
This paper proposes Markovian Generative Adversarial Networks (MGANs), a method for training generative neural networks for efficient texture synthesis. While deep neural network approaches have recently demonstrated remarkable synthesis quality, they still come at considerable computational cost (minutes of run-time for low-resolution images). Our paper addresses this efficiency issue. Instead of the numerical deconvolution used in previous work, we precompute a feed-forward, strided convolutional network that captures the feature statistics of Markovian patches and can directly generate outputs of arbitrary dimensions. Such a network can directly decode brown noise into realistic texture, or photos into artistic paintings. With adversarial training, we obtain quality comparable to recent neural texture synthesis methods. As optimization is no longer required at generation time, our run-time performance (0.25M pixel images at 25Hz) surpasses previous neural texture synthesizers by a significant margin (at least 500 times faster). We apply this idea to texture synthesis, style transfer, and video stylization.
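The claim that a strided convolutional decoder can generate outputs of arbitrary dimensions follows from the network being fully convolutional: output size is a function of input size alone. The sketch below illustrates this with the standard output-size formula for a transposed (fractionally strided) convolution without dilation or output padding. The two-layer configuration in `LAYERS` is a hypothetical example, not the architecture from the paper. The last line restates the throughput implied by the quoted figures (0.25M pixels per image at 25Hz).

```python
def transposed_conv_out(size, kernel, stride, padding):
    """Output length of a transposed convolution along one axis
    (no dilation, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def decoder_output_size(feature_size, layers):
    """Propagate a spatial size through a stack of transposed convolutions.

    layers: list of (kernel, stride, padding) tuples.
    """
    for kernel, stride, padding in layers:
        feature_size = transposed_conv_out(feature_size, kernel, stride, padding)
    return feature_size


# Hypothetical decoder: two layers that each double the resolution.
LAYERS = [(4, 2, 1), (4, 2, 1)]

# The same weights handle any input size; only the output size changes.
small = decoder_output_size(32, LAYERS)
large = decoder_output_size(64, LAYERS)

# Throughput implied by the abstract: 0.25M pixels per frame at 25 frames/s.
pixels_per_second = int(0.25e6 * 25)
```

Since no layer depends on a fixed spatial size, a larger feature map or noise input simply yields a proportionally larger image from a single forward pass.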
