LouvreSAE: Sparse Autoencoders for Interpretable and Controllable Style Transfer
Raina Panda, Daniel Fein, Arpita Singhal, Mark Fiore, Maneesh Agrawala, Matyas Bohacek
TL;DR
This work defines an operational notion of artistic style and introduces LouvreSAE, an art-specific Sparse Autoencoder trained on CLIP embeddings to extract a sparse, interpretable dictionary of style concepts. Style profiles are constructed from a handful of reference artworks and steer generative models in latent space without fine-tuning, enabling lightweight, interpretable, and composable style transfer. Evaluations on ArtBench10 show LouvreSAE achieves strong style fidelity while delivering 1.7–20x speedups over baselines, and qualitative results demonstrate clear, human-understandable control over stylistic elements. The approach emphasizes interpretability through a taxonomy of concepts and an autointerpretability pipeline, offering a practical path toward fine-grained, user-controlled style manipulation in generative systems.
Abstract
Artistic style transfer in generative models remains a significant challenge, as existing methods often introduce style only via model fine-tuning, additional adapters, or prompt engineering, all of which can be computationally expensive and may still entangle style with subject matter. In this paper, we introduce a training- and inference-light, interpretable method for representing and transferring artistic style. Our approach leverages an art-specific Sparse Autoencoder (SAE) on top of latent embeddings of generative image models. Trained on artistic data, our SAE learns an emergent, largely disentangled set of stylistic and compositional concepts, corresponding to style-related elements pertaining to brushwork, texture, and color palette, as well as semantic and structural concepts. We call it LouvreSAE and use it to construct style profiles: compact, decomposable steering vectors that enable style transfer without any model updates or optimization. Unlike prior concept-based style transfer methods, our method requires no fine-tuning, no LoRA training, and no additional inference passes, enabling direct steering of artistic styles from only a few reference images. We validate our method on ArtBench10, matching or surpassing existing methods on style evaluations (VGG Style Loss and CLIP Score Style) while being 1.7–20x faster and, critically, interpretable.
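To make the idea of a style profile concrete, the following is a minimal NumPy sketch of how a sparse dictionary could yield a steering vector. It is an illustration under stated assumptions, not the authors' implementation: the ReLU encoder, the baseline-subtracted top-k feature selection, the additive steering rule, and every name and dimension (`sae_encode`, `style_profile`, `steer`, `alpha`, `d`, `m`, `k`) are hypothetical, and random arrays stand in for a trained SAE and for real image embeddings.

```python
import numpy as np

def sae_encode(x, W_enc, b_enc):
    """ReLU encoder: maps embeddings (n, d) to nonnegative sparse codes (n, m)."""
    return np.maximum(x @ W_enc + b_enc, 0.0)

def style_profile(ref_codes, base_codes, k):
    """Keep the k features most elevated in the references relative to a baseline.

    Assumption: the baseline subtraction and top-k rule are illustrative choices.
    """
    diff = ref_codes.mean(axis=0) - base_codes.mean(axis=0)
    top = np.argsort(diff)[-k:]
    profile = np.zeros_like(diff)
    profile[top] = np.maximum(diff[top], 0.0)
    return profile

def steer(embedding, profile, W_dec, alpha=1.0):
    """Add the decoded style direction to an embedding; no weights are updated."""
    return embedding + alpha * (profile @ W_dec)

rng = np.random.default_rng(0)
d, m, k = 16, 64, 5  # embedding dim, dictionary size, features kept (hypothetical)

# Random stand-ins for a trained SAE's encoder and decoder weights.
W_enc = rng.normal(size=(d, m)) / np.sqrt(d)
b_enc = np.zeros(m)
W_dec = rng.normal(size=(m, d)) / np.sqrt(m)

refs = rng.normal(size=(4, d))    # stand-in: embeddings of a few reference artworks
base = rng.normal(size=(32, d))   # stand-in: embeddings of a generic image set
content = rng.normal(size=d)      # stand-in: the embedding to be steered

profile = style_profile(
    sae_encode(refs, W_enc, b_enc), sae_encode(base, W_enc, b_enc), k
)
steered = steer(content, profile, W_dec, alpha=0.8)
```

Because the steering step in this sketch is linear in the profile, two profiles can be summed before steering, which is one simple way to read "decomposable" and "composable"; individual dictionary features can likewise be zeroed or rescaled in the profile before decoding.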
