Universal Photorealistic Style Transfer: A Lightweight and Adaptive Approach

Rong Liu, Enyu Zhao, Zhiyuan Liu, Andrew Feng, Scott John Easley

TL;DR

This work proposes a Universal Photorealistic Style Transfer (UPST) framework that delivers accurate photorealistic style transfer on high-resolution images and videos without relying on pre-training, and incorporates a lightweight StyleNet for per-instance transfer.

Abstract

Photorealistic style transfer aims to apply stylization while preserving the realism and structure of the input content. However, existing methods often suffer from color tone distortions, dependence on pair-wise pre-training, inefficiency with high-resolution inputs, and the need for additional constraints in video style transfer. To address these issues, we propose a Universal Photorealistic Style Transfer (UPST) framework that delivers accurate photorealistic style transfer on high-resolution images and videos without relying on pre-training. Our approach incorporates a lightweight StyleNet for per-instance transfer, which preserves color tone accuracy, supports high-resolution inputs, and maintains rapid processing speeds. To further enhance photorealism and efficiency, we introduce instance-adaptive optimization, which uses an adaptive coefficient to prioritize content image realism and early stopping to accelerate network convergence. Additionally, because UPST strongly preserves non-color information, it enables seamless video style transfer without additional constraints. Experimental results show that UPST consistently produces photorealistic outputs and significantly reduces GPU memory usage, making it an effective and universal solution for various photorealistic style transfer tasks.
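
The abstract mentions early stopping as a way to shorten per-instance optimization. The sketch below illustrates that general idea only. It is not the paper's stopping criterion: the helper name `optimize_with_early_stopping`, the `patience` and `min_delta` parameters, and the toy scalar objective (standing in for a per-instance StyleNet loss) are all assumptions made for illustration.

```python
def optimize_with_early_stopping(step, max_iters=1000, patience=5, min_delta=1e-4):
    """Call `step()` repeatedly until the loss plateaus.

    `step` performs one optimization step and returns the loss measured
    before that update. Optimization stops once the loss has failed to
    improve on the best value by more than `min_delta` for `patience`
    consecutive steps. (Hypothetical helper, not from the paper.)
    """
    best = float("inf")
    stale = 0
    it = 0
    for it in range(1, max_iters + 1):
        loss = step()
        if best - loss > min_delta:
            best = loss      # meaningful improvement: reset the counter
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break        # loss has plateaued, stop early
    return it, best


# Toy stand-in for a per-instance objective: minimize (x - 3)^2
# with plain gradient descent. A real setup would update network weights.
state = {"x": 0.0}
LEARNING_RATE = 0.1


def gradient_step():
    x = state["x"]
    loss = (x - 3.0) ** 2
    state["x"] = x - LEARNING_RATE * 2.0 * (x - 3.0)
    return loss


iters, best = optimize_with_early_stopping(gradient_step)
print(f"stopped after {iters} iterations, best loss {best:.6f}")
```

On this toy problem the loop ends well before `max_iters`, which is the behavior early stopping is meant to provide: iterations are spent only while the loss is still dropping meaningfully.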

Paper Structure (12 sections, 11 equations, 5 figures, 5 tables, 1 algorithm)

Figures (5)

  • Figure 1: Overview of Universal Photorealistic Style Transfer (UPST) Pipeline and Instance-Adaptive Optimization.
  • Figure 2: Architecture of Lightweight StyleNet.
  • Figure 3: Qualitative Comparison. Given (a) an input pair consisting of content (top) and style (bottom), we show the results produced by (b) Ours, (c) $\text{PhotoWCT}^2$ [Chiu_2022_WACV], (d) $\text{WCT}^2$ [yoo2019photorealistic], and (e) Neural Preset [ke2023neural]. Our approach outperforms the others in prioritizing content photorealism and transferring the desired style effects. Note that our method achieves these results without pre-training on annotated datasets.
  • Figure 4: Multi-Frame Applications.
  • Figure 5: Qualitative Comparison of Ablation Studies.