
ProcessPainter: Learn Painting Process from Sequence Data

Yiren Song, Shijie Huang, Chen Yao, Xiaojun Ye, Hai Ci, Jiaming Liu, Yuxuan Zhang, Mike Zheng Shou

TL;DR

ProcessPainter reframes painting as a video-generation task to reproduce authentic artist painting processes from text prompts or reference images. It combines synthetic pretraining with artist-specific LoRA fine-tuning and introduces an Artwork Replication Network for controllable sequence generation, including conversion of artworks to process key-frames and completion of semi-finished pieces. The approach achieves more human-like, consistent painting sequences than stroke-based baselines, with quantitative gains in reconstruction metrics and positive user perceptions. This work offers practical tools for art education and creative image generation that illuminate the step-by-step 'how' of painting.

Abstract

The painting process of artists is inherently stepwise and varies significantly among different painters and styles. Generating detailed, step-by-step painting processes is essential for art education and research, yet remains largely underexplored. Traditional stroke-based rendering methods break down images into sequences of brushstrokes, yet they fall short of replicating the authentic processes of artists because they are confined to basic brushstroke modifications. Text-to-image diffusion models generate images through iterative denoising, which also diverges substantially from artists' painting processes. To address these challenges, we introduce ProcessPainter, a text-to-video model that is first pre-trained on synthetic data and then fine-tuned on a select set of artists' painting sequences using LoRA. This approach successfully generates painting processes from text prompts for the first time. Furthermore, we introduce an Artwork Replication Network capable of accepting arbitrary-frame input, which facilitates the controlled generation of painting processes, the decomposition of images into painting sequences, and the completion of semi-finished artworks. This paper offers new perspectives and tools for advancing art education and image generation technology.
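
The abstract mentions fine-tuning with LoRA. As a rough illustration of the general low-rank adaptation idea only (not the authors' training code), the sketch below adds a trainable low-rank update to a frozen weight matrix. The layer sizes, the rank `r`, and the scaling `alpha` are assumptions chosen for the example; none of them come from the paper.

```python
import numpy as np

def lora_linear(x, W, A, B, alpha):
    """Linear layer with a frozen weight W plus a low-rank update (alpha / r) * B @ A."""
    r = A.shape[0]  # rank of the update
    return x @ (W + (alpha / r) * (B @ A)).T

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 6, 2                 # hypothetical sizes, not from the paper

W = rng.normal(size=(d_out, d_in))       # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-initialised
x = rng.normal(size=(3, d_in))           # a batch of 3 input vectors

y_base = x @ W.T                         # output of the frozen layer alone
y_lora = lora_linear(x, W, A, B, alpha=4.0)

full_params = W.size                     # parameters touched by full fine-tuning
lora_params = A.size + B.size            # parameters trained by the low-rank update
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer, and only `A` and `B` would be updated during fine-tuning. This is one reason a small set of painting sequences can plausibly be enough to specialise a pre-trained model.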

Paper Structure (24 sections, 5 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Overall schematics of our method. During the training phase, we first pre-train the Painting model and Artwork Replication Network on 40,000 synthetic data points. Then, we fine-tune the Painting LoRA model on a small amount of artists' painting process data. During inference, ProcessPainter generates the painting process step by step from a reference image, producing the final painting as the last frame. It can also refine a partially completed image based on textual descriptions and the input image. When no reference image is provided, ProcessPainter generates the painting process solely from textual descriptions.
  • Figure 2: Text-to-painting-process generation results. ProcessPainter can learn different process styles from synthetic data.
  • Figure 3: A Painting LoRA can be fine-tuned on only 10-50 sequences of artists' painting processes, which is enough to capture the characteristics of the artists' painting process and the style of the final results.
  • Figure 4: Compared with stroke-based rendering methods, our method provides a more precise reconstruction of the original images, and the painting process more closely resembles that of human artists. For different types of paintings, the strategies of the painting process can be controlled by switching the Painting Model or Painting LoRA.
  • Figure 5: Given a reference image, ProcessPainter can generate a painting process that ensures specific frames match the reference image exactly.
  • ...and 1 more figure
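
The captions for Figures 1 and 5 describe generation in which specific frames match a reference image, whether that reference is a finished artwork or a partially completed one. One way to picture such arbitrary-frame input is as a frame stack plus a mask that marks which frames are given. The sketch below is an assumption made for illustration: the function name `build_frame_condition`, the 8-frame sequence length, and the mask encoding are hypothetical and are not taken from the paper.

```python
import numpy as np

def build_frame_condition(num_frames, known_frames, height, width, channels=3):
    """Place known frames at their indices; return the frame stack and a mask of given frames."""
    frames = np.zeros((num_frames, height, width, channels), dtype=np.float32)
    mask = np.zeros(num_frames, dtype=bool)
    for idx, image in known_frames.items():
        if not 0 <= idx < num_frames:
            raise IndexError(f"frame index {idx} is outside 0..{num_frames - 1}")
        if image.shape != (height, width, channels):
            raise ValueError(f"frame {idx} has shape {image.shape}")
        frames[idx] = image   # copy the reference image into its slot
        mask[idx] = True      # mark this frame as provided
    return frames, mask

# Case 1 (hypothetical): a finished artwork is given as the last of 8 frames,
# and the earlier frames are left for the model to generate.
finished = np.ones((4, 4, 3), dtype=np.float32)
frames, mask = build_frame_condition(8, {7: finished}, height=4, width=4)

# Case 2 (hypothetical): a semi-finished piece is given as an intermediate frame,
# and the remaining frames are left for the model to generate.
half_done = np.full((4, 4, 3), 0.5, dtype=np.float32)
frames2, mask2 = build_frame_condition(8, {3: half_done}, height=4, width=4)
```

Under this encoding, the two use cases differ only in which indices are marked as given; how the actual network consumes such input is not specified in this summary.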