DyArtbank: Diverse Artistic Style Transfer via Pre-trained Stable Diffusion and Dynamic Style Prompt Artbank

Zhanjie Zhang, Quanwei Zhang, Guangyuan Li, Junsheng Luan, Mengyuan Yang, Yun Wang, Lei Zhao

TL;DR

DyArtbank introduces a Dynamic Style Prompt Artbank (DSPA) and a Key Content Feature Prompt (KCFP) module to enable diverse and highly realistic artistic style transfer with a pre-trained Stable Diffusion model. DSPA learns a distribution over style representations from a collection of artworks and draws samples from it via the reparameterization trick, which drives stylistic diversity, while KCFP preserves content structure through a fine-tuned ControlNet and a VAE encoder. Training proceeds in two stages (DSPA first, then KCFP) and, together with an orthogonality regularization, promotes both diversity and content fidelity. Qualitative and quantitative results are reported as superior to consistent artistic style transfer (CAST) methods and to existing diverse artistic style transfer (DAST) methods. Because style prompts can be sampled repeatedly, the framework also works as a form of data augmentation, producing varied stylizations for downstream tasks. Overall, DyArtbank shows that a large-scale pre-trained diffusion model can be steered toward diverse, realistic artistic stylizations with strong content preservation.
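The sampling step mentioned above can be illustrated with a minimal sketch of the reparameterization trick. This is not the authors' implementation: the function name `sample_style_prompt`, the four-dimensional prompt, and the `mean` / `log_var` values are hypothetical placeholders, and a real DSPA would hold learned tensors matching the Stable Diffusion prompt-embedding shape. The sketch only shows why each draw yields a different style prompt while staying close to the learned style.

```python
import math
import random


def sample_style_prompt(mean, log_var, rng):
    """Draw one style prompt with the reparameterization trick.

    Each coordinate is computed as z = mu + sigma * eps, with eps ~ N(0, 1)
    and sigma = exp(0.5 * log_var). The randomness lives entirely in eps,
    so mu and log_var stay differentiable parameters in a real framework.
    """
    sample = []
    for mu, lv in zip(mean, log_var):
        sigma = math.exp(0.5 * lv)   # standard deviation from log-variance
        eps = rng.gauss(0.0, 1.0)    # noise independent of the parameters
        sample.append(mu + sigma * eps)
    return sample


# Hypothetical "learned" parameters (assumed values, for illustration only).
mean = [0.2, -0.1, 0.5, 0.0]
log_var = [-2.0, -2.0, -1.0, -3.0]

rng = random.Random(0)
prompt_a = sample_style_prompt(mean, log_var, rng)
prompt_b = sample_style_prompt(mean, log_var, rng)

# Two draws differ, which is the source of stylistic diversity.
print(prompt_a)
print(prompt_b)
```

Under these assumptions, a smaller `log_var` keeps samples closer to `mean` (less diverse, more consistent stylization), while a larger one spreads them out.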

Abstract

Artistic style transfer aims to transfer a learned style onto an arbitrary content image. However, most existing style transfer methods can only render consistent stylized images, so repeated inference gives users essentially the same result rather than a varied set of stylizations. To solve this issue, we propose a novel artistic style transfer framework called DyArtbank, which can generate diverse and highly realistic artistic stylized images. Specifically, we introduce a Dynamic Style Prompt ArtBank (DSPA), a set of learnable parameters. It learns and stores style information from a collection of artworks and dynamically guides pre-trained Stable Diffusion to generate diverse and highly realistic artistic stylized images. DSPA can also generate random artistic image samples from the learned style information, providing a new idea for data augmentation. In addition, a Key Content Feature Prompt (KCFP) module is proposed to provide sufficient content prompts for pre-trained Stable Diffusion, preserving the detailed structure of the input content image. Extensive qualitative and quantitative experiments verify the effectiveness of the proposed method. Code is available at https://github.com/Jamie-Cheung/DyArtbank

Paper Structure

This paper contains 20 sections, 8 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: Artistic stylized examples. The $1^{st}$ and $2^{nd}$ columns show the input content and style images. The $3^{rd}$ and $4^{th}$ columns show the stylized images synthesized by consistent style transfer methods (e.g., ArtBank zhang2024artbank). Even across multiple inferences, ArtBank produces only consistent artistic stylized images (i.e., the same content structure and style appearance). The other columns show stylized images with the same content structure but different style appearance, generated by diverse style transfer methods (e.g., Eps-AM cheng2023user and our proposed DyArtbank). Note: the random seeds are fixed for all methods.
  • Figure 2: The training pipeline of DyArtbank. In stage one, we train a Dynamic Style Prompt Artbank (DSPA) to learn and store style information from the collection of artworks. In stage two, we train a Key Content Feature Prompt (KCFP) module to learn content prompts from photographs.
  • Figure 3: The pipeline of artistic stylized image generation using DyArtbank. At inference, a new parameter is sampled from DSPA and dynamically guides the pre-trained Stable Diffusion to generate diverse artistic stylized images. In addition, to preserve the detailed structure of the input content image, KCFP provides sufficient content prompts for the pre-trained Stable Diffusion.
  • Figure 4: Qualitative comparison. Compared to the previous version (ArtBank zhang2024artbank), DyArtbank better preserves the structure of the content images.
  • Figure 5: Ablation study of KCFP.
  • ...and 5 more figures