FAGStyle: Feature Augmentation on Geodesic Surface for Zero-shot Text-guided Diffusion Image Style Transfer

Yuexing Han, Liheng Ruan, Bing Wang

TL;DR

This work introduces FAGStyle, a zero-shot text-guided diffusion image style transfer method that demonstrates superior performance over existing methods, consistently achieving stylization that retains the semantic content of the source image.

Abstract

The goal of image style transfer is to render an image guided by a style reference while maintaining the original content. Existing image-guided methods rely on specific style reference images, restricting their wider application and potentially compromising result quality. As a flexible alternative, text-guided methods allow users to describe the desired style using text prompts. Despite their versatility, these methods often struggle with maintaining style consistency, reflecting the described style accurately, and preserving the content of the target image. To address these challenges, we introduce FAGStyle, a zero-shot text-guided diffusion image style transfer method. Our approach enhances inter-patch information interaction by incorporating the Sliding Window Crop technique and Feature Augmentation on Geodesic Surface into our style control loss. Furthermore, we integrate a Pre-Shape self-correlation consistency loss to ensure content consistency. FAGStyle demonstrates superior performance over existing methods, consistently achieving stylization that retains the semantic content of the source image. Experimental results confirm the efficacy of FAGStyle across a diverse range of source contents and styles, both imagined and common.
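
The abstract names a Sliding Window Crop technique but gives no parameters for it. As a rough illustration only, the following is a minimal sketch of generic overlapping sliding-window cropping, not the authors' implementation. The function names, the `window` and `stride` parameters, and the 8x8 toy image are all hypothetical; the sketch assumes that overlap between neighbouring crops (stride smaller than window) is what allows patches to share pixels.

```python
from typing import List, Tuple

# (top, left, bottom, right); bottom and right are exclusive.
Box = Tuple[int, int, int, int]


def sliding_window_boxes(height: int, width: int, window: int, stride: int) -> List[Box]:
    """Enumerate square crop boxes that cover an image with a sliding window.

    `window` and `stride` are illustrative parameters, not values from the paper.
    """
    if window > height or window > width:
        raise ValueError("window must fit inside the image")
    if stride <= 0:
        raise ValueError("stride must be positive")

    def starts(size: int) -> List[int]:
        # Regular grid of offsets, plus one final offset flush with the border
        # so the last rows and columns are always covered.
        offsets = list(range(0, size - window + 1, stride))
        if offsets[-1] != size - window:
            offsets.append(size - window)
        return offsets

    return [
        (top, left, top + window, left + window)
        for top in starts(height)
        for left in starts(width)
    ]


def crop(image: List[List[float]], box: Box) -> List[List[float]]:
    """Cut one patch out of an image stored as a list of rows."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


# Toy 8x8 "image" whose pixel value encodes its position.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
boxes = sliding_window_boxes(8, 8, window=4, stride=2)
patches = [crop(image, box) for box in boxes]
```

With a stride of 2 and a window of 4, each patch overlaps its neighbour by half its width, so every interior pixel appears in more than one patch.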

Paper Structure (24 sections, 23 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: The flowchart of our proposed FAGStyle. To better guide the diffusion model for style transfer, we improve the style control loss and content control loss by employing the Sliding Window Crop (SWC) and Feature Augmentation on Geodesic Surface (FAGS) strategies. The gradients of these improved losses are added to the denoised image at each time step during the inference of the diffusion model to guide the style transfer process.
  • Figure 2: Text-guided image style transfer results using FAGStyle.
  • Figure 3: Qualitative comparison of our proposed method with other text-guided style transfer methods in imagined styles.
  • Figure 4: Qualitative comparison of our proposed method with other style transfer methods in some of the common styles.
  • Figure 5: Qualitative ablation study of the proposed improvements in FAGStyle regarding style control loss and content control loss.
  • ...and 2 more figures