Opt-In Art: Learning Art Styles Only from Few Examples

Hui Ren, Joanna Materzynska, Rohit Gandikota, David Bau, Antonio Torralba

TL;DR

The paper investigates whether artistic styles can be learned without pretraining on paintings. It trains a photograph-only diffusion model (Blank Canvas Diffusion) and adapts it to individual styles with a LoRA-based Art Style Adapter trained on a handful of artworks. With careful loss design and prompt conditioning, the adapted model produces images that match the style of real artworks, with results comparable to models trained on large art datasets according to both automatic metrics and human judgments. Data-attribution analyses indicate that both the Blank Canvas data (photographs with art filtered out) and the small set of art exemplars contribute to the generated styles, highlighting the role of real-world imagery in shaping artistic outputs. The work also raises questions about copyright, consent, and regulation in AI-generated art, and offers a practical opt-in pathway for style generation with minimal training data.

Abstract

We explore whether pre-training on datasets with paintings is necessary for a model to learn an artistic style with only a few examples. To investigate this, we train a text-to-image model exclusively on photographs, without access to any painting-related content. We show that it is possible to adapt a model that is trained without paintings to an artistic style, given only a few examples. User studies and automatic evaluations confirm that our model (post-adaptation) performs on par with state-of-the-art models trained on massive datasets that contain artistic content like paintings, drawings or illustrations. Finally, using data attribution techniques, we analyze how both artistic and non-artistic datasets contribute to generating artistic-style images. Surprisingly, our findings suggest that high-quality artistic outputs can be achieved without prior exposure to artistic data, indicating that artistic style generation can occur in a controlled, opt-in manner using only a limited, carefully selected set of training examples.

Paper Structure

This paper contains 25 sections, 4 equations, 37 figures, 5 tables.

Figures (37)

  • Figure 1: (a) We introduce Blank Canvas Diffusion, a carefully curated text-to-image model trained only on photographs, serving as the pretraining foundation for our model. Our study explores whether a model with no prior exposure to paintings can learn artistic styles using (b) a LoRA Art Adapter trained on a small opt-in sample of an artist's work. (c) We find that it is possible to adapt a model that is trained without paintings to generalize an artistic style, given only a few examples.
  • Figure 2: Examples of images included and excluded from the Blank Canvas dataset. The dataset is curated to remove paintings as well as artistic categories related to paintings, such as drawings and fine art. Examples of images that are close to the removal threshold are shown in columns (b) and (c).
  • Figure 3: Our Blank Canvas Diffusion model shows limited style transfer with training-free methods, suggesting that traditional models may rely on inherent stylistic biases. Unlike our model, traditional models have been trained on paintings, drawings, illustrations or other forms of digital art, enabling them to internalize stylistic patterns for effective style transfer.
  • Figure 4: Our model has no prior knowledge of paintings. It not only fails to generate the artwork indicated by the prompts, but its outputs also lack any apparent stylistic elements.
  • Figure 5: The generated image should match the style of a small exemplar dataset when prompted with a caption $C^*$, which includes a style prefix $V^*$. For example, if $C^* = \textit{People walking along a riverside path with colorful trees in the style of } V^*$, the image should reflect both the scene (content) and the specified artistic style. Content loss ensures that the visual elements of the prompt $C = \textit{People walking along a riverside path with colorful trees}$ are accurate, while style loss maintains the style associated with $V^*$.
  • ...and 32 more figures
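
The prompt setup described in the Figure 5 caption can be illustrated with a minimal sketch. It builds the styled caption C* from a content caption C and combines a style term with a content term. The function names, the plain mean-squared-error terms, and the `content_weight` parameter are all assumptions made for illustration; the caption does not give the actual loss formulas, so this is not the authors' implementation.

```python
# Illustrative sketch only. All names and the loss form are assumptions,
# not taken from the paper's code.

def make_style_caption(content_caption: str, style_token: str = "V*") -> str:
    """Build the styled caption C* from the content caption C and a style token."""
    return f"{content_caption} in the style of {style_token}"


def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(style_pred, style_target, content_pred, content_target,
                  content_weight=1.0):
    """Assumed objective: a style term (model prompted with C*) plus a
    weighted content term (model prompted with C)."""
    style_term = mse(style_pred, style_target)
    content_term = mse(content_pred, content_target)
    return style_term + content_weight * content_term


if __name__ == "__main__":
    C = "People walking along a riverside path with colorful trees"
    print(make_style_caption(C))
    # Toy vectors stand in for model predictions and their targets.
    print(combined_loss([0.0, 0.5], [0.1, 0.4], [0.2, 0.2], [0.2, 0.3]))
```

In this sketch, `content_weight` is a hypothetical knob for trading off fidelity to the scene described by C against adherence to the style tied to V*.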