Few-shot Calligraphy Style Learning

Fangda Chen, Jiacheng Nie, Lichuan Jiang, Zhuoer Zeng

TL;DR

This paper addresses the scarcity of data for replicating a specific calligraphy style by proposing Presidifussion, a two-stage diffusion framework that is pretrained on a broad set of ancient calligraphy and then fine-tuned on a small dataset of Xu's calligraphy. It introduces font image conditioning and a novel stroke information conditioning to enforce structural fidelity in generated characters, achieving competitive results with far less data than prior methods. The approach is data-efficient, which suits cultural heritage digitization and the digital preservation of calligraphic art, and an SSIM-based evaluation supports the structural similarity of its outputs to real Xu artworks. The work points to practical potential for scalable, high-fidelity style transfer in historical handwriting and suggests further conditioning and augmentation as ways to improve accuracy.

Abstract

We introduce "Presidifussion," a novel approach to learning and replicating the unique calligraphy style of President Xu, using a pretrained diffusion model adapted through a two-stage training process. The model is first pretrained on a diverse dataset containing works from various calligraphers and then fine-tuned on a smaller, specialized dataset of President Xu's calligraphy, comprising just under 200 images. Our method introduces font image conditioning and stroke information conditioning, which enable the model to capture the intricate structural elements of Chinese characters. We demonstrate the effectiveness of our approach through a comparison with traditional methods such as zi2zi and CalliGAN: our model achieves comparable performance with significantly smaller datasets and fewer computational resources. This work not only presents a breakthrough in the digital preservation of calligraphic art but also sets a new standard for data-efficient generative modeling in the domain of cultural heritage digitization.
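
The TL;DR above mentions an SSIM-based evaluation, but this summary does not give the settings the authors used. As a rough illustration of what the metric measures, the sketch below computes a simplified single-window SSIM over two grayscale images. The function name `global_ssim` and the constants `k1 = 0.01`, `k2 = 0.03` (the commonly used defaults) are assumptions for illustration. Standard SSIM averages the same statistic over local windows, so this is not a reproduction of the paper's evaluation.

```python
def global_ssim(img_a, img_b, data_range=255.0, k1=0.01, k2=0.03):
    """Simplified SSIM computed over the whole image as a single window.

    img_a, img_b: equally sized grayscale images given as lists of rows.
    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    xs = [float(p) for row in img_a for p in row]
    ys = [float(p) for row in img_b for p in row]
    if not xs or len(xs) != len(ys):
        raise ValueError("images must be non-empty and the same size")

    n = len(xs)
    # Stabilising constants from the standard SSIM definition.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2

    # Means, variances, and covariance of the two pixel populations.
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((x - mu_x) ** 2 for x in xs) / n
    var_y = sum((y - mu_y) ** 2 for y in ys) / n
    cov_xy = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Toy 3x3 "stroke": a vertical bar, and the same image with inverted pixels.
stroke = [[0, 255, 0], [0, 255, 0], [0, 255, 0]]
inverted = [[255 - p for p in row] for row in stroke]

same_score = global_ssim(stroke, stroke)
diff_score = global_ssim(stroke, inverted)
```

An image compared with itself scores 1, and the score drops as structure diverges, which is why the metric is a natural fit for checking whether a generated character keeps the stroke layout of a reference.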

Paper Structure

This paper contains 15 sections, 2 equations, 6 figures, and 2 tables.

Figures (6)

  • Figure 1: Samples of ancient artworks of calligraphy. The three rows of characters stand for the style of Regular Script, Semi-cursive Script, and Clerical Script, respectively.
  • Figure 2: Model architecture
  • Figure 3: Font image conditioning
  • Figure 4: GPT2-like stroke information conditioning
  • Figure 5: On the left column are the original artworks, while the right column showcases the generated results. None of the characters tested appear in the fine-tuning dataset.
  • ...and 1 more figure