Font Style Interpolation with Diffusion Models
Tetta Kondo, Shumpei Takezaki, Daichi Haraguchi, Seiichi Uchida
TL;DR
This work tackles font style interpolation by using diffusion models to blend two reference fonts. It introduces three interpolation strategies (image-blending, condition-blending, and noise-blending) and shows that they can produce both expected and serendipitous font styles. In qualitative and quantitative evaluations, including character recognition and stroke-width interpolation tests, the methods show competitive readability and diverse outputs, with FANnet serving as a conservative baseline. The approach offers a flexible, pixel-domain framework that may extend to broader image domains, with room for future improvements in latent-space smoothness and set-wise interpolation across alphabets.
Abstract
Fonts vary widely in style and give readers different impressions, so generating new fonts is a worthwhile way to give readers new impressions. In this paper, we employ diffusion models to generate new font styles by interpolating between a pair of reference fonts with different styles. More specifically, we propose three interpolation approaches based on diffusion models: image-blending, condition-blending, and noise-blending. We perform qualitative and quantitative experimental analyses to understand the style generation ability of the three approaches. According to the experimental results, the three proposed approaches can generate not only expected font styles but also somewhat serendipitous ones. We also compare the approaches with a state-of-the-art style-conditional Latin-font generative network model to confirm the validity of using diffusion models for the style interpolation task.
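As an illustration only, the three approaches can be read as applying the same linear blend at three different points: to the reference images, to the style conditions, or to the noise predicted under each condition. The sketch below assumes that reading; the abstract does not specify how each blend enters the diffusion process. The names `lerp`, `toy_predict_noise`, and the blending weight `lam` are hypothetical, and the denoiser is a toy stand-in rather than a trained model.

```python
import numpy as np

def lerp(a, b, lam):
    """Linear interpolation: lam=0 returns a, lam=1 returns b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (1.0 - lam) * a + lam * b

def toy_predict_noise(x_t, cond):
    """Hypothetical stand-in for a trained conditional denoiser eps(x_t, cond).

    It is deliberately nonlinear in the condition so that blending
    conditions and blending noise predictions give different results.
    """
    return np.tanh(x_t + np.mean(cond))

def image_blend(img_a, img_b, lam):
    """Assumed image-blending: mix the two reference glyph images pixel-wise."""
    return lerp(img_a, img_b, lam)

def condition_blend_noise(x_t, cond_a, cond_b, lam):
    """Assumed condition-blending: mix the style conditions, then predict noise once."""
    return toy_predict_noise(x_t, lerp(cond_a, cond_b, lam))

def noise_blend(x_t, cond_a, cond_b, lam):
    """Assumed noise-blending: predict noise under each condition, then mix the predictions."""
    eps_a = toy_predict_noise(x_t, cond_a)
    eps_b = toy_predict_noise(x_t, cond_b)
    return lerp(eps_a, eps_b, lam)

# Toy inputs: an 8x8 noisy image and two 4-dimensional style conditions.
x_t = np.zeros((8, 8))
cond_a = np.zeros(4)
cond_b = np.full(4, 2.0)

eps_cond = condition_blend_noise(x_t, cond_a, cond_b, 0.5)
eps_noise = noise_blend(x_t, cond_a, cond_b, 0.5)
```

In this toy setting the two noise estimates coincide at `lam=0` and `lam=1` but differ in between, because the stand-in denoiser is nonlinear in its condition. That gap is one plausible reason the approaches could yield different intermediate styles.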
