AesFA: An Aesthetic Feature-Aware Arbitrary Neural Style Transfer

Joonwoo Kwon, Sooyoung Kim, Yuewei Lin, Shinjae Yoo, Jiook Cha

TL;DR

AesFA (Aesthetic Feature-Aware NST) is a lightweight but effective model that outperforms recent NST methods in stylization quality, achieves faster inference, and introduces a new aesthetic feature contrastive loss.

Abstract

Neural style transfer (NST) has evolved significantly in recent years. Yet despite this rapid progress, existing NST methods either struggle to transfer aesthetic information from a style effectively or suffer from high computational costs and inefficient feature disentanglement because they rely on pre-trained models. This work proposes a lightweight but effective model, AesFA (Aesthetic Feature-Aware NST). The primary idea is to decompose the image by frequency to better disentangle aesthetic styles from the reference image, while training the entire model end-to-end so that pre-trained models are excluded completely at inference. To help the network extract more distinct representations and further enhance stylization quality, this work introduces a new aesthetic feature contrastive loss. Extensive experiments and ablations show that the approach not only outperforms recent NST methods in stylization quality but also achieves faster inference. Code is available at https://github.com/Sooyyoungg/AesFA.
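
To make the idea of decomposing an image by frequency concrete, here is a minimal sketch. It is not the AesFA implementation: it assumes a grayscale image stored as a list of rows, uses block averaging as a crude low-pass filter, and treats the residual as the high-frequency part. The function name `split_frequencies` and the `factor` parameter are hypothetical, chosen only for illustration.

```python
def split_frequencies(image, factor=2):
    """Split a grayscale image (list of rows) into low- and high-frequency parts.

    Assumption for illustration only: "low frequency" is the mean of each
    non-overlapping factor x factor block, and "high frequency" is the residual.
    """
    h, w = len(image), len(image[0])
    if h % factor or w % factor:
        raise ValueError("height and width must be divisible by factor")

    low = [[0.0] * w for _ in range(h)]
    for by in range(0, h, factor):
        for bx in range(0, w, factor):
            # Average the block: this keeps only the coarse structure.
            block = [image[y][x]
                     for y in range(by, by + factor)
                     for x in range(bx, bx + factor)]
            mean = sum(block) / len(block)
            for y in range(by, by + factor):
                for x in range(bx, bx + factor):
                    low[y][x] = mean

    # The residual holds the fine detail (edges, texture) the average removed.
    high = [[image[y][x] - low[y][x] for x in range(w)] for y in range(h)]
    return low, high


image = [
    [0.0, 2.0, 4.0, 4.0],
    [2.0, 4.0, 4.0, 4.0],
    [1.0, 1.0, 8.0, 0.0],
    [1.0, 1.0, 0.0, 8.0],
]
low, high = split_frequencies(image)
```

By construction the two parts sum back to the input, so nothing is lost in the split. A learned model would process the two branches separately rather than simply adding them back together.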

Paper Structure (12 sections, 4 equations, 9 figures, 1 table)


Figures (9)

  • Figure 1: Top: The Starry Night by Vincent Van Gogh. The style is strongly correlated with spatial information, as evidenced by the whirling patterns and expressionistic yellow stars in the "sky". Bottom: Compared with other NST methods, our method transfers styles faithfully while preserving spatial information.
  • Figure 2: The entire AesFA architecture for aesthetic feature-aware NST. The blue and green arrows indicate the high- and low-frequency feature processes, respectively.
  • Figure 3: The detailed design of the Adaptive Octave Convolutions (AdaOct) used in AesFA.
  • Figure 4: Illustration of the aesthetic style contrastive loss in a toy example alongside the other training losses employed in AesFA.
  • Figure 5: Qualitative comparison with various NST algorithms at 256-pixel resolution. Each column shows the stylized images produced by a different state-of-the-art model.
  • ...and 4 more figures
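
The contrastive loss illustrated in Figure 4 is not defined in this summary, so the sketch below shows only the general shape of a contrastive objective, using the standard InfoNCE form over cosine similarities. It should not be read as the paper's loss: the function names, the feature vectors, and the `temperature` value are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: low when the anchor is close to the positive
    and far from every negative. Illustrative only, not the AesFA loss."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Hypothetical 2-D "style features": the anchor should match the positive.
anchor = [1.0, 0.0]
loss_good = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
loss_bad = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1]])
```

Here `loss_good` is smaller than `loss_bad`, because in the first call the positive is the vector most similar to the anchor, while in the second call a negative is more similar than the positive.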