HFH-Font: Few-shot Chinese Font Synthesis with Higher Quality, Faster Speed, and Higher Resolution

Hua Li; Zhouhui Lian

TL;DR

HFH-Font is a few-shot font synthesis method that efficiently generates high-resolution glyph images, which can be converted into high-quality vector glyphs. It significantly outperforms existing font synthesis approaches.

Abstract

The challenge of automatically synthesizing high-quality vector fonts, particularly for writing systems (e.g., Chinese) consisting of a huge number of complex glyphs, remains unsolved. Existing font synthesis techniques fall into two categories: 1) methods that directly generate vector glyphs, and 2) methods that first synthesize glyph images and then vectorize them. However, the first category often fails to construct complete and correct shapes for complex glyphs, while the second struggles to efficiently synthesize high-resolution (i.e., $1024 \times 1024$ or higher) glyph images while preserving local details. In this paper, we introduce HFH-Font, a few-shot font synthesis method capable of efficiently generating high-resolution glyph images that can be converted into high-quality vector glyphs. More specifically, our method employs a diffusion model-based generative framework with component-aware conditioning to learn different levels of style information adaptable to varying input reference sizes. We also design a distillation module based on Score Distillation Sampling for 1-step fast inference, and a style-guided super-resolution module to refine and upscale low-resolution synthesis results. Extensive experiments, including a user study with professional font designers, demonstrate that our method significantly outperforms existing font synthesis approaches. Experimental results show that our method produces high-fidelity, high-resolution raster images which can be vectorized into high-quality vector fonts. Using our method, for the first time, large-scale Chinese vector fonts of a quality comparable to those manually created by professional font designers can be automatically generated.

Paper Structure (26 sections, 7 equations, 19 figures, 3 tables)

This paper contains 26 sections, 7 equations, 19 figures, 3 tables.

Introduction
Related Work
Font Generation
Font Generation in Vector Modality
Font Generation in Image Modality
Diffusion Models
Speeding Up Diffusion Models
Method
Conditional LDM with Component-aware Conditioning
Latent diffusion model for font generation
Component-aware conditioning
One-step Generation via Score Distillation Sampling
Towards Higher Resolution
Experiments
Experimental Setup
...and 11 more sections

Figures (19)

Figure 1: An overview of our method. The three segments denote three parts of our framework. Weights from the Stage A low-resolution model are used to initialize the training of Stage B1 and Stage B2, which are parallel to each other. Note that the colored images are only for visualization; only gray-scale glyph images are input into the network.
Figure 2: Our reference selection procedure, which enables the model to deal with different levels of style information.
Figure 3: Comparison of generated results from models trained on the large dataset, on unseen fonts and seen characters. n_ref denotes the number of style references.
Figure 4: Comparison of generated results from models trained on the large dataset, on unseen fonts and unseen characters. n_ref denotes the number of style references.
Figure 5: Extra generated results from our model trained on the large dataset. n_ref denotes the number of style references.
...and 14 more figures
