Font Style Interpolation with Diffusion Models

Tetta Kondo; Shumpei Takezaki; Daichi Haraguchi; Seiichi Uchida

TL;DR

This work tackles font style interpolation by using diffusion models to blend two reference fonts. It introduces three interpolation strategies (image-blending, condition-blending, and noise-blending) and shows that they can produce both expected and serendipitous font styles. Qualitative and quantitative evaluations, including character recognition and stroke-width interpolation tests, indicate competitive readability and diverse outputs, with FANnet serving as a conservative baseline. The approach is a flexible, pixel-domain framework that may extend to broader image domains; directions for future work include smoother latent spaces and set-wise interpolation across alphabets.

Abstract

Fonts vary widely in style and give readers different impressions, so generating new fonts is a way to give readers new impressions. In this paper, we employ diffusion models to generate new font styles by interpolating a pair of reference fonts with different styles. More specifically, we propose three interpolation approaches that use diffusion models: image-blending, condition-blending, and noise-blending. We perform qualitative and quantitative experimental analyses to understand the style generation ability of the three approaches. According to the experimental results, the three proposed approaches can generate not only expected font styles but also somewhat serendipitous ones. We also compare the approaches with a state-of-the-art style-conditional Latin-font generative network model to confirm the validity of using diffusion models for the style interpolation task.

Paper Structure (27 sections, 4 equations, 8 figures, 3 tables)

Introduction
Related Work
Font Style Features
Font Generation by Diffusion Models
Three Approaches for Font Style Interpolation with Diffusion Models
Conditional Diffusion Model for Character Image Generation
Image-Blending Approach
Condition-Blending Approach
Noise-Blending Approach
Experimental Results
Datasets
Implementation Details
Diffusion Model Architecture
Training Diffusion Models
Style Feature Extraction
...and 12 more sections

Figures (8)

Figure 1: Examples of font style interpolation by our approach, called noise-blending. The reference images $\mathbf{r}_1, \mathbf{r}_2$ in the top row: Google Fonts. Other rows: MyFonts.
Figure 2: Overview of the denoising process and our three approaches for font style interpolation: (a) Image blending. (b) Condition blending. (c) Noise blending. For simplicity, several operations (constant multiplications and addition of stochastic perturbation) are omitted from the denoising process. In (a)-(c), the conditions $t$ and $c$, which are not essential to the illustration, are also omitted.
Figure 3: Character images in various font styles.
Figure 4: (a) Overview of FANnet (Roy et al., CVPR 2020), which is trained to internally extract the style feature $\mathbf{s}$. (b) Our comparative model based on FANnet. A blended style feature is used to generate an interpolated image.
Figure 5: Interpolation between the light and bold versions of the same font family in the Google Fonts dataset. The medium version (GT) is shown as a quasi-ground-truth.
...and 3 more figures
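
The three blend points named in the Figure 2 caption can be illustrated with a toy sketch. This is not the paper's implementation: `lerp`, `toy_eps`, `sample`, the blend weight `lam`, and the 8x8 stroke images are all hypothetical, and the mapping of each approach name to a blend point (pixels, style condition, predicted noise) is an assumption read from the names and the caption. The toy predictor is linear in the style input, so condition-blending and noise-blending coincide here; with a trained nonlinear network they would generally differ.

```python
import numpy as np

def lerp(a, b, lam):
    """Linear blend; lam=1 gives a, lam=0 gives b."""
    return lam * a + (1.0 - lam) * b

def toy_eps(x, t, s=None):
    """Hypothetical stand-in for a trained noise predictor.

    With a style array s, it treats s as the clean image. Without one,
    it treats the binarized input as the clean image (a crude clean-up).
    """
    target = s if s is not None else (x > 0.5).astype(x.dtype)
    return x - target

def sample(x_start, eps_fn, steps=20):
    """Toy deterministic denoising loop, not a real DDPM/DDIM update."""
    x = x_start.copy()
    for t in range(steps, 0, -1):
        x = x - eps_fn(x, t) / t  # the t=1 step lands on the current target
    return x

def image_blending(r1, r2, lam, rng, noise_scale=0.1):
    # Assumed reading: blend the reference images in pixel space,
    # perturb the blend, then denoise without a style condition.
    x = lerp(r1, r2, lam) + noise_scale * rng.standard_normal(r1.shape)
    return sample(x, lambda x_t, t: toy_eps(x_t, t))

def condition_blending(s1, s2, lam, x_T):
    # Assumed reading: blend the two style conditions first,
    # then denoise under the single blended condition.
    s = lerp(s1, s2, lam)
    return sample(x_T, lambda x_t, t: toy_eps(x_t, t, s))

def noise_blending(s1, s2, lam, x_T):
    # Assumed reading: predict noise under each style separately
    # and blend the two predictions at every step.
    def eps_fn(x_t, t):
        return lerp(toy_eps(x_t, t, s1), toy_eps(x_t, t, s2), lam)
    return sample(x_T, eps_fn)

rng = np.random.default_rng(0)
r1 = np.zeros((8, 8)); r1[:, 3:5] = 1.0  # thin vertical stroke
r2 = np.zeros((8, 8)); r2[:, 1:7] = 1.0  # thick vertical stroke
x_T = rng.standard_normal((8, 8))        # shared starting noise

# In this toy, the reference images double as their own style features.
out_img = image_blending(r1, r2, 0.5, rng)
out_cond = condition_blending(r1, r2, 0.5, x_T)
out_noise = noise_blending(r1, r2, 0.5, x_T)
```

In this toy, the pixels where the two strokes disagree start near 0.5 in the image-blending path, so the clean-up step pushes each one to 0 or 1 depending on the added noise, while the other two paths keep those pixels at the intermediate blended value.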
