DyArtbank: Diverse Artistic Style Transfer via Pre-trained Stable Diffusion and Dynamic Style Prompt Artbank
Zhanjie Zhang, Quanwei Zhang, Guangyuan Li, Junsheng Luan, Mengyuan Yang, Yun Wang, Lei Zhao
TL;DR
DyArtbank introduces Dynamic Style Prompt Artbank (DSPA) and Key Content Feature Prompt (KCFP) to enable diverse and highly realistic artistic style transfer using a pre-trained Stable Diffusion model. DSPA learns a distributed style representation from multiple artworks and enables sampling via the reparameterization trick to drive stylistic diversity, while KCFP preserves content structure through a fine-tuned ControlNet and a VAE encoder. The two-stage training (DSPA first, then KCFP) and orthogonality regularization promote both diversity and content fidelity, with extensive qualitative and quantitative results showing superiority over CAST and existing DAST methods. This framework also enables data-augmentation-like sampling of style prompts, offering practical benefits for generating varied stylizations and for downstream tasks. Overall, DyArtbank demonstrates that large-scale pre-trained diffusion models can be effectively steered to achieve diverse, realistic artistic stylizations with strong content preservation.
Abstract
Artistic style transfer aims to transfer a learned style onto an arbitrary content image. However, most existing style transfer methods can only render a single, consistent stylized result, so users cannot obtain a varied set of stylized images. To address this issue, we propose a novel artistic style transfer framework called DyArtbank, which can generate diverse and highly realistic artistic stylized images. Specifically, we introduce a Dynamic Style Prompt ArtBank (DSPA), a set of learnable parameters that learns and stores style information from a collection of artworks and dynamically guides pre-trained Stable Diffusion to generate diverse and highly realistic stylized images. DSPA can also generate random artistic image samples from the learned style information, which suggests a new approach to data augmentation. In addition, we propose a Key Content Feature Prompt (KCFP) module that provides sufficient content prompts for pre-trained Stable Diffusion to preserve the detailed structure of the input content image. Extensive qualitative and quantitative experiments verify the effectiveness of the proposed method. Code is available at https://github.com/Jamie-Cheung/DyArtbank
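
The TL;DR notes that DSPA style prompts are sampled via the reparameterization trick. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes the style bank stores a per-token mean and log-variance, and the names `sample_style_prompt`, `mu`, `log_var`, and the tensor sizes are hypothetical choices for illustration.

```python
import numpy as np


def sample_style_prompt(mu, log_var, rng):
    """Draw one style prompt z = mu + sigma * eps, with eps ~ N(0, I).

    Writing the sample as a deterministic function of (mu, log_var) plus
    external noise is what lets gradients reach the learnable parameters
    in an autodiff framework. NumPy is used here only to show the arithmetic.
    """
    eps = rng.standard_normal(mu.shape)   # noise independent of the parameters
    sigma = np.exp(0.5 * log_var)         # standard deviation from log-variance
    return mu + sigma * eps


rng = np.random.default_rng(0)

# Hypothetical sizes: 4 prompt tokens, each an 8-dimensional embedding.
num_tokens, dim = 4, 8
mu = rng.standard_normal((num_tokens, dim))   # stand-in for learned means
log_var = np.full((num_tokens, dim), -2.0)    # stand-in for learned log-variances

# Two draws from the same learned distribution give two different prompts,
# which is the source of stylistic diversity described in the summary.
prompt_a = sample_style_prompt(mu, log_var, rng)
prompt_b = sample_style_prompt(mu, log_var, rng)
```

In a setup like this, each sampled prompt would be passed to the diffusion model as conditioning, so repeated sampling yields different stylizations of the same content image.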
