Font Impression Estimation in the Wild

Kazuki Kitajima; Daichi Haraguchi; Seiichi Uchida

TL;DR

The paper tackles the challenge of estimating font impressions from in-the-wild word images when the impression annotations are noisy and incomplete. It proposes an exemplar-based framework: an $N$-class font classifier, trained on synthetic word images generated by SynthTiger, maps an input image to exemplar fonts; the tags of the top $\tilde{n}$ exemplars are then ensembled, and only impressions that appear in at least $p$ of those exemplars are retained. The approach outperforms conventional multi-label CNN baselines in macro-F1 and is robust to missing labels. A large-scale application to 207,572 book covers reveals meaningful correlations between font impressions and book genres. These findings suggest practical utility for perceptual font analysis and design guidance, while also highlighting sensitivity to the hyperparameters $\tilde{n}$ and $p$ and an opportunity to integrate the observed correlations into design generators.
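The ensembling step described above can be illustrated with a minimal sketch. It assumes the classifier outputs one score per exemplar font and that each exemplar carries a set of impression tags; the function name `estimate_impressions`, the argument names, and the toy scores and tags are hypothetical and not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence, Set


def estimate_impressions(
    class_scores: Sequence[float],
    exemplar_tags: Sequence[Set[str]],
    n_top: int,
    p: int,
) -> List[str]:
    """Ensemble the tags of the n_top highest-scoring exemplar fonts.

    class_scores[i]  -- classifier score for exemplar font i (assumed input)
    exemplar_tags[i] -- impression tags attached to exemplar font i
    n_top            -- number of exemplars to ensemble (n-tilde in the text)
    p                -- minimum number of exemplars that must share a tag
    """
    if len(class_scores) != len(exemplar_tags):
        raise ValueError("one tag set is required per exemplar font")

    # Rank exemplar fonts by classifier score, highest first.
    ranked = sorted(
        range(len(class_scores)),
        key=lambda i: class_scores[i],
        reverse=True,
    )
    top = ranked[:n_top]

    # Count how many of the selected exemplars carry each tag.
    votes = Counter(tag for i in top for tag in exemplar_tags[i])

    # Keep only tags supported by at least p exemplars.
    return sorted(tag for tag, count in votes.items() if count >= p)


# Toy example with three exemplar fonts (illustrative values only).
scores = [0.1, 0.6, 0.3]
tags = [{"bold", "heavy"}, {"elegant", "thin"}, {"elegant", "script"}]
print(estimate_impressions(scores, tags, n_top=2, p=2))  # ['elegant']
```

In this toy case the two best-scoring exemplars share only the tag `elegant`, so a tag attached to a single exemplar (for instance because of a noisy annotation) is filtered out, while a tag missing from one font can still be recovered from its neighbours when `p` is smaller than `n_top`.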

Abstract

This paper addresses the challenging task of estimating font impressions from real font images. We use a font dataset annotated with font impressions and a convolutional neural network (CNN) framework for this task. However, the impressions attached to individual fonts are often missing or noisy because font impression annotation is subjective. To realize stable impression estimation even with such a dataset, we propose an exemplar-based impression estimation approach, which ensembles the impressions of exemplar fonts that are similar to the input image. In addition, we train the CNN with synthetic font images that mimic scanned word images so that it can estimate the impressions of font images in the wild. We evaluate the basic performance of the proposed estimation method quantitatively and qualitatively. Then, we conduct a correlation analysis between book genres and font impressions on real book cover images; this analysis is only possible with our impression estimation method. The analysis reveals various trends in the correlation between them, which supports the hypothesis that book cover designers carefully choose a font for a book cover by considering the impression the font gives.

Paper Structure (18 sections, 9 figures, 1 table)

Introduction
Related Work
Impressions of Fonts
Font Usage Analysis
Font Dataset
Exemplar-based Impression Estimation
Impression Estimation Experiment on Synthetic Word Images
Synthetic Word Images
Font Classification Model and Its Training
Comparative Models
Evaluation Metrics and Hyperparameters
Quantitative Evaluation Results
Qualitative Evaluation Results
Application: Correlation Analysis Between Book Genres and Font Impression on Book Covers
Purpose
...and 3 more sections

Figures (9)

Figure 1: Fonts and their impressions tags, provided by 1001freefonts.com.
Figure 2: Two approaches for the font impression estimation task. (a) Multi-label classification, and (b) Our exemplar-based impression estimation. This is an example of the conditions $\theta = 0.8$, $\tilde{n} = 2$ and $p = 2$.
Figure 3: The number of fonts with the 100 most frequent impression tags. Note that the vertical axis is logarithmic and the 16 tags in red are ignored due to the existence of near-identical tags.
Figure 4: Examples of the synthetic word images for training the font classifier. Images are generated by SynthTiger (Yim et al., 2021).
Figure 5: F1 scores under the different hyperparameter values, $\tilde{n}$ and $p$.
...and 4 more figures
