GIST: Towards Photorealistic Style Transfer via Multiscale Geometric Representations

Renan A. Rojas-Gomez, Minh N. Do

TL;DR

GIST addresses artifacts from discriminative-encoder-based photorealistic style transfer by introducing a training-free framework that uses multiscale geometric representations (Wavelets and Contourlets) and optimal transport to align content and style subbands. By operating under a Gaussian relaxation, it yields a closed-form $W_2$ transport map, enabling efficient, model-free style transfer without decoding networks. The approach replaces learned autoencoders with a multiresolution encoder/decoder, extends to artistic transfer via Edge Tangent Flow, and supports style interpolation, all while achieving competitive quality with substantially reduced computation. This makes photorealistic stylization practical for real-time applications and semantically aware editing without relying on large pretrained models.
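For reference, the closed-form map mentioned above is a standard result for Gaussians, written here as a worked formula. The symbols $\mu_c, \Sigma_c$ (content subband mean and covariance) and $\mu_s, \Sigma_s$ (style subband mean and covariance) are notation assumed for this note and may differ from the paper's. Assuming $\Sigma_c$ is positive definite, the $W_2$-optimal map from $\mathcal{N}(\mu_c, \Sigma_c)$ to $\mathcal{N}(\mu_s, \Sigma_s)$ is affine:

$$
T(x) = \mu_s + A\,(x - \mu_c), \qquad
A = \Sigma_c^{-1/2}\left(\Sigma_c^{1/2}\,\Sigma_s\,\Sigma_c^{1/2}\right)^{1/2}\Sigma_c^{-1/2},
$$

and the corresponding squared distance is

$$
W_2^2 = \lVert \mu_c - \mu_s \rVert_2^2 + \operatorname{tr}\!\left(\Sigma_c + \Sigma_s - 2\left(\Sigma_c^{1/2}\,\Sigma_s\,\Sigma_c^{1/2}\right)^{1/2}\right).
$$

Because $A$ is symmetric, $A\,\Sigma_c\,A = \Sigma_s$, so the transported content matches the style's first- and second-order statistics exactly.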

Abstract

State-of-the-art Style Transfer methods often leverage pre-trained encoders optimized for discriminative tasks, which may not be ideal for image synthesis. This can result in significant artifacts and loss of photorealism. Motivated by the ability of multiscale geometric image representations to capture fine-grained details and global structure, we propose GIST: Geometric-based Image Style Transfer, a novel Style Transfer technique that exploits the geometric properties of content and style images. GIST replaces the standard Neural Style Transfer autoencoding framework with a multiscale image expansion, preserving scene details without the need for post-processing or training. Our method matches multiresolution and multidirectional representations such as Wavelets and Contourlets by solving an optimal transport problem, leading to efficient texture transfer. Experiments show that GIST is on par with or outperforms recent photorealistic Style Transfer approaches while significantly reducing processing time and requiring no model training.
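To make the subband matching step concrete, here is a minimal NumPy sketch of a Gaussian optimal transport map applied to one pair of subbands. It is an illustration under stated assumptions, not the authors' implementation: the function names `sqrtm_psd` and `gaussian_ot_map`, the `(n_coefficients, n_channels)` layout, and the `eps` regularizer are all hypothetical choices made for this example. Computing the subbands themselves (Wavelets or Contourlets) is left out.

```python
import numpy as np


def sqrtm_psd(mat):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_ot_map(content, style, eps=1e-8):
    """Transport content coefficients (n, d) toward style coefficients (m, d).

    Both inputs are treated as samples from Gaussians (an assumption), so the
    W2-optimal map is affine:
        T(x) = mu_s + A (x - mu_c)
        A    = C^{-1/2} (C^{1/2} S C^{1/2})^{1/2} C^{-1/2}
    where C and S are the content and style covariances.
    """
    d = content.shape[1]
    mu_c, mu_s = content.mean(axis=0), style.mean(axis=0)
    # eps * I keeps the content covariance invertible (hypothetical safeguard).
    cov_c = np.cov(content, rowvar=False).reshape(d, d) + eps * np.eye(d)
    cov_s = np.cov(style, rowvar=False).reshape(d, d) + eps * np.eye(d)
    root_c = sqrtm_psd(cov_c)
    inv_root_c = np.linalg.inv(root_c)
    A = inv_root_c @ sqrtm_psd(root_c @ cov_s @ root_c) @ inv_root_c
    return mu_s + (content - mu_c) @ A.T
```

In a full pipeline, a subband of shape `(H, W, 3)` would be flattened to `(H * W, 3)` before the call and reshaped afterwards. The content and style inputs may have different numbers of coefficients, since only their means and covariances enter the map.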

Paper Structure

This paper contains 18 sections, 23 equations, 9 figures, 3 tables, 1 algorithm.

Figures (9)

  • Figure 1: Photorealistic Style Transfer via geometric image representations. We propose GIST, a Geometric-based Image Style Transfer technique that aligns multiscale representations such as Wavelets and Contourlets to efficiently transfer style from arbitrary images. Our method performs on par with or better than deep learning methods such as WCT$^{2}$ in content and texture preservation without the need for training or extensive computations.
  • Figure 2: GIST: Style Transfer using multiscale geometric representations. To create a stylized image ${\bm{i}}_{\text{cs}}$, GIST progressively aligns the content from ${\bm{f}}_{c}$ subbands and the style from ${\bm{f}}_{s}$ subbands from coarse to fine resolution using an optimal transport map $t$. This ensures the preservation of content attributes while incorporating the perceptual properties of the style image. GIST can handle general geometric image representations such as Wavelets and Contourlets.
  • Figure 3: Enforcing artistic Style Transfer with subband fusion. We compute the Edge Tangent Flow of the style image ${\bm{i}}_{s}$ to extract its detail subbands $\{{\bm{g}}^{l}_{s,k>0}\}_{l=1}^{L}$. These are then fused with the corresponding content subbands from coarse to fine scale, promoting an artistic image appearance.
  • Figure 4: Interpolating style in the representation space. A convex combination of the content and style subbands enables fine control over the stylization strength. For instance, given a single style reference ${\bm{i}}^{1}_{s}$ and blending factors $\bm{\lambda}=(\lambda_{0}\ \ 1-\lambda_{0})$, increasing the weight of the content reference $\lambda_{0}$ attenuates the Style Transfer effect.
  • Figure 5: Fine-grained Style Transfer via semantic labels. Our geometric representation approach allows for targeted stylization using semantic labels, providing control over the Style Transfer process at specific image regions at a fraction of the cost of deep learning methods without sacrificing photorealism.
  • ...and 4 more figures
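
The style interpolation described in the Figure 4 caption can be sketched as a weighted sum of subbands. This is a minimal illustration, assuming each subband is a NumPy array of the same shape; the function name `blend_subbands` and its argument layout are hypothetical, and whether the blended style subbands are the raw or the transported ones is not stated in this summary, so the sketch simply takes them as inputs.

```python
import numpy as np


def blend_subbands(content_band, style_bands, weights):
    """Convex combination of one content subband and K style subbands.

    weights[0] multiplies the content subband and weights[1:] multiply the
    style subbands. Weights must be non-negative and sum to 1.
    """
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (len(style_bands) + 1,):
        raise ValueError("need one weight for the content plus one per style")
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    stack = np.stack([content_band, *style_bands])  # shape (K + 1, ...)
    return np.tensordot(weights, stack, axes=1)
```

With a single style reference, calling `blend_subbands(content_band, [style_band], [lam0, 1 - lam0])` reproduces the two-term case from the caption: raising `lam0` toward 1 returns the content subband unchanged, which attenuates the stylization.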