Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks
Alex J. Champandard
TL;DR
The paper tackles the unpredictability of CNN-based style transfer by introducing semantic annotations that guide generation. It presents an augmented CNN architecture that injects semantic maps into feature representations, enabling content-aware style transfer and doodle-to-painting synthesis. The approach remains compatible with patch-based methods and demonstrates improved control, fewer artifacts, and broader applicability, connecting advances in image segmentation with image synthesis. The work offers practical tools for producing coherent, semantically consistent stylizations of portraits, landscapes, and other subjects.
Abstract
Convolutional neural networks (CNNs) have proven highly effective at image synthesis and style transfer. For most users, however, using them as tools can be challenging because their behavior is unpredictable and runs against common intuitions. This paper introduces a novel concept to augment such generative architectures with semantic annotations, either by manually authoring pixel labels or by using existing solutions for semantic segmentation. The result is a content-aware generative algorithm that offers meaningful control over the outcome. We thereby improve the quality of the generated images by avoiding common glitches, make the results look significantly more plausible, and extend the functional range of these algorithms, whether for portraits, landscapes, or other subjects. Applications include semantic style transfer and turning doodles with few colors into masterful paintings!
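As a rough illustration of what augmenting a generative architecture with semantic annotations could look like, the sketch below appends downsampled label channels to a layer's feature channels. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `downsample_nearest` and `augment_features`, the `weight` parameter, and the choice of nearest-neighbour downsampling and channel concatenation are all hypothetical. Plain Python lists are used so the example runs without dependencies.

```python
def downsample_nearest(channel, out_h, out_w):
    """Resize one 2D label channel to (out_h, out_w) by nearest-neighbour sampling."""
    in_h, in_w = len(channel), len(channel[0])
    return [
        [channel[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def augment_features(features, semantic_map, weight=1.0):
    """Append weighted semantic channels to a stack of feature channels.

    features:     list of 2D channels, all of the same height and width
    semantic_map: list of 2D label channels at image resolution
    weight:       hypothetical scale for how strongly the labels count
    """
    out_h, out_w = len(features[0]), len(features[0][0])
    extra = [
        [[weight * v for v in row] for row in downsample_nearest(ch, out_h, out_w)]
        for ch in semantic_map
    ]
    # Assumption: semantic information is injected by channel concatenation.
    return features + extra


# One 2x2 feature channel and one 4x4 mask marking the top-left region.
features = [[[0.0, 0.0], [0.0, 0.0]]]
mask = [[[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]]
augmented = augment_features(features, mask, weight=10.0)
```

With this layout, any comparison between feature vectors (for example patch matching) also sees the label channels, so a larger `weight` would make matches across different semantic regions less likely.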
