SigStyle: Signature Style Transfer via Personalized Text-to-Image Models
Ye Wang, Tongyuan Bai, Xuping Xie, Zili Yi, Yilin Wang, Rui Ma
TL;DR
SigStyle tackles signature style transfer from a single reference by learning a dedicated style representation through a hypernetwork that fine-tunes only decoder-attention weights in a personalized diffusion framework. It represents the style as a token (*) and preserves content by performing DDIM inversion on the content image and injecting content-attention priors during the first $k$ denoising steps. The approach yields high-quality global and local transfers, supports texture transfer and style fusion, and enables style-guided text-to-image generation, outperforming several state-of-the-art baselines in both qualitative and quantitative evaluations. Overall, SigStyle offers a single-image, parameter-efficient way to explicitly and controllably preserve signature-style attributes in diffusion-based synthesis, with potential for broader deployment and more controllable prompts.
Abstract
Style transfer enables the seamless integration of artistic styles from a style image into a content image, resulting in visually striking and aesthetically enriched outputs. Despite numerous advances in this field, existing methods have not explicitly focused on the signature style, which represents the distinct and recognizable visual traits of an image, such as geometric and structural patterns, color palettes, and brush strokes. In this paper, we introduce SigStyle, a framework that leverages the semantic priors embedded in a personalized text-to-image diffusion model to capture the signature style representation. This style capture process is powered by a hypernetwork that efficiently fine-tunes the diffusion model for any given single style image. Style transfer is then formulated as reconstructing the content image through the learned style tokens of the personalized diffusion model. Additionally, to ensure content consistency throughout the style transfer process, we introduce a time-aware attention swapping technique that incorporates content information from the original image into the early denoising steps of target image generation. Beyond enabling high-quality signature style transfer across a wide range of styles, SigStyle supports multiple applications, such as local style transfer, texture transfer, style fusion, and style-guided text-to-image generation. Quantitative and qualitative evaluations demonstrate that our approach outperforms existing style transfer methods at recognizing and transferring signature styles.
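The time-aware attention swapping idea can be illustrated with a toy sketch. This is not the authors' implementation: it assumes that "swapping" means replacing the target branch's attention map with the one computed from the content branch's queries and keys for steps before a cutoff `k`, and it uses plain Python lists in place of real diffusion features. The function names (`attention_map`, `apply_attention`, `time_aware_attention`) and the toy inputs are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_map(queries, keys):
    """Row-wise softmax of scaled dot products: softmax(Q K^T / sqrt(d))."""
    d = len(queries[0])
    scale = math.sqrt(d)
    return [
        softmax([sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys])
        for q in queries
    ]

def apply_attention(attn, values):
    """Weighted sum of value vectors for each query position."""
    dim = len(values[0])
    return [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(dim)]
        for row in attn
    ]

def time_aware_attention(step, k, content_qk, target_qkv):
    """Assumed rule: for step < k use the content branch's attention map,
    otherwise use the target branch's own map. Values always come from
    the target branch."""
    q_t, k_t, v_t = target_qkv
    if step < k:
        q_c, k_c = content_qk
        attn = attention_map(q_c, k_c)
    else:
        attn = attention_map(q_t, k_t)
    return apply_attention(attn, v_t)

# Toy inputs (hypothetical): two positions, two feature channels.
content_qk = ([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
target_qkv = (
    [[0.0, 1.0], [1.0, 0.0]],   # target queries
    [[1.0, 0.0], [0.0, 1.0]],   # target keys
    [[1.0, 2.0], [3.0, 4.0]],   # target values
)

early = time_aware_attention(step=0, k=3, content_qk=content_qk, target_qkv=target_qkv)
late = time_aware_attention(step=5, k=3, content_qk=content_qk, target_qkv=target_qkv)
```

In this sketch, `early` is driven by the content image's spatial layout while `late` follows the target branch alone, which mirrors the stated intent of constraining structure early and leaving later steps free to express style. The cutoff `k` trades content fidelity against stylization strength under this assumed rule.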
