Hierarchical Co-Embedding of Font Shapes and Impression Tags

Yugo Kubota, Kaito Shiku, Seiichi Uchida

Abstract

Font shapes can evoke a wide range of impressions, but the correspondence between fonts and impression descriptions is not one-to-one: some impressions are broadly compatible with diverse styles, whereas others strongly constrain the set of plausible fonts. We refer to this graded constraint strength as style specificity. In this paper, we propose a hyperbolic co-embedding framework that models font-impression correspondence through entailment rather than simple paired alignment. Font images and impression descriptions, represented as single tags or tag sets, are embedded in a shared hyperbolic space with two complementary entailment constraints: impression-to-font entailment and low-to-high style-specificity entailment among impressions. This formulation induces a radial structure in which low style-specificity impressions lie near the origin and high style-specificity impressions lie farther away, yielding an interpretable geometric measure of how strongly an impression constrains font style. Experiments on the MyFonts dataset demonstrate improved bidirectional retrieval over strong one-to-one baselines. In addition, traversal and tag-level analyses show that the learned space captures a coherent progression from ambiguous to more style-specific impressions and provides a meaningful, data-driven quantification of style specificity.

Paper Structure

This paper contains 22 sections, 13 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: Fonts and their associated impression tags in the MyFonts dataset. (a) Fonts are annotated with sets of impression tags. (b) Single tags vary in style specificity, from broadly shared (low) to highly style-specific (high). (c) Adding complementary tags typically increases style specificity.
  • Figure 2: Hyperbolic co-embedding with entailment cones. (a) Feature space where fonts and impression descriptions (tags or tag sets) are co-embedded, capturing impression-to-font entailment and style-specificity entailment among impressions. (b) Radius encodes style specificity from low to high, as the result of hyperbolic co-embedding.
  • Figure 3: Cone aperture and style specificity. Cones widen near the origin for low-specificity tags (e.g., "elegant") and narrow at larger radii for high-specificity tags (e.g., "skinny"), yielding broader vs. more constrained font-style coverage.
  • Figure 4: Overview. We embed a font $F_n$, its impression tag set $S_n$, and a subset $\tilde{S}_n$ in a shared hyperbolic space. Entailment cones place the lower style-specificity embedding of $\tilde{S}_n$ nearer the origin and the higher style-specificity embedding of $S_n$ farther away, enforcing $\tilde{\bm{i}}_n\!\to\!\bm{i}_n$. The cones also impose impression-to-font entailment $\bm{i}_n\!\to\!\bm{f}_n$ through $\mathrm{aper}(\cdot)$ and $\mathrm{ext}(\cdot,\cdot)$.
  • Figure 5: Histograms of distances from the origin $o$ for fonts, impression-tag sets, and impression-tag subsets on the test split. Our method shows a clear radial ordering, whereas Impression-CLIP+ exhibits substantial overlap.
  • ...and 2 more figures
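
The captions of Figures 3 and 4 refer to a cone half-aperture $\mathrm{aper}(\cdot)$ and an exterior angle $\mathrm{ext}(\cdot,\cdot)$, but this excerpt does not give their definitions. The following is a minimal sketch, assuming the Lorentz-model entailment-cone formulation that is common in hyperbolic image-text embedding work; the paper's own equations may differ. The curvature magnitude `C`, the boundary constant `K`, and every function name other than `aper` and `ext` are assumptions made for illustration.

```python
import math

C = 1.0  # assumed curvature magnitude (the space has curvature -C)
K = 0.1  # assumed constant bounding the aperture near the origin


def _norm(v):
    return math.sqrt(sum(a * a for a in v))


def exp_map0(v, c=C):
    """Map a Euclidean vector v to the space components of a hyperboloid point."""
    n = _norm(v)
    if n == 0.0:
        return list(v)
    scale = math.sinh(math.sqrt(c) * n) / (math.sqrt(c) * n)
    return [scale * a for a in v]


def time_component(x, c=C):
    """Time component implied by the hyperboloid constraint."""
    return math.sqrt(1.0 / c + sum(a * a for a in x))


def radius(x, c=C):
    """Geodesic distance from the origin o to x."""
    return math.acosh(math.sqrt(c) * time_component(x, c)) / math.sqrt(c)


def aper(x, c=C, k=K):
    """Half-aperture of the cone at x: wide near the origin, narrow far away."""
    n = _norm(x)
    if n == 0.0:
        return math.pi / 2
    return math.asin(min(1.0, 2.0 * k / (math.sqrt(c) * n)))


def ext(x, y, c=C, eps=1e-12):
    """Exterior angle at x between the ray from the origin and the geodesic to y."""
    xt, yt = time_component(x, c), time_component(y, c)
    # Lorentzian inner product, scaled by c; always <= -1 on the hyperboloid.
    c_inner = c * (sum(a * b for a, b in zip(x, y)) - xt * yt)
    num = yt + xt * c_inner
    den = _norm(x) * math.sqrt(max(c_inner * c_inner - 1.0, eps))
    return math.acos(max(-1.0, min(1.0, num / den)))


def entailment_loss(x, y, c=C, k=K):
    """Zero when y lies inside the cone of x (x entails y), positive otherwise."""
    return max(0.0, ext(x, y, c) - aper(x, c, k))


# Hypothetical embeddings: a generic impression near the origin, a font farther
# out in the same direction, and a font in the opposite direction.
impression = exp_map0([0.5, 0.0])
font_inside = exp_map0([2.0, 0.0])
font_outside = exp_map0([-2.0, 0.0])
```

Under these assumptions, `entailment_loss(impression, font_inside)` is zero because the font sits inside the impression's cone, while `entailment_loss(impression, font_outside)` is positive. The aperture shrinks with distance from the origin, which matches the widening and narrowing of cones described in the Figure 3 caption.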