What Text Design Characterizes Book Genres?

Daichi Haraguchi, Brian Kenji Iwana, Seiichi Uchida

TL;DR

This work investigates how non-verbal information, specifically book genres, can be inferred from text design on book covers and how text design interacts with semantic content. It introduces a Hierarchical Transformer that jointly processes semantic word embeddings (300-D) and text-design features (font style, character color, background color, text height, text position) derived from text images on covers. The results show that semantic features suffice for genre classification, but incorporating text design yields modest yet consistent gains, with font style and text position being particularly informative for certain genres. Attention visualizations and ablation analyses reveal which design elements contribute to which genres, offering actionable insights for designers and for data-driven generation of context-aware text designs. The study emphasizes future work on larger datasets and exploring interactions among design features to better capture genre-specific aesthetics.

Abstract

This study analyzes the relationship between non-verbal information (e.g., genres) and text design (e.g., font style and character color) through the classification of book genres using text design on book covers. Text images carry both semantic information about the word itself and non-semantic information, or visual design, such as font style and character color. When we read a printed word, we receive impressions and other information from both the word itself and its visual design. Verbal information can be understood from semantic information alone, i.e., from the words themselves; text design, however, can help convey additional non-verbal information such as impressions and genre. To investigate the effect of text design, we analyze the words printed on book covers and the genres of those books in two scenarios. First, we examine how important visual design is for determining the genre of a book (i.e., non-verbal information) by comparing how semantic information and visual design each relate to genres. In this experiment, we found that semantic information is sufficient to determine the genre, but text design helps by adding more discriminative features for book genres. Second, we investigate the effect of each text design element on book genres and find that each element characterizes some genres. For example, font style adds discriminative features for the genres "Mystery, Thriller & Suspense" and "Christian books & Bibles."
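As a rough illustration of how semantic and text-design information could be combined into one per-word input, the sketch below concatenates a 300-D word embedding with the five design features named in the summary. This is not the authors' implementation: the per-feature dimensionalities in `DESIGN_DIMS`, the function `build_word_token`, and the zero-masking used to mimic "removing" a feature are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence

SEMANTIC_DIM = 300  # word-embedding size stated in the summary

# Hypothetical sizes for each design feature (assumptions, not from the paper).
DESIGN_DIMS: Dict[str, int] = {
    "font_style": 8,
    "char_color": 3,     # e.g. RGB
    "bg_color": 3,       # e.g. RGB
    "text_height": 1,
    "text_position": 2,  # e.g. (x, y)
}


def build_word_token(
    semantic: Sequence[float],
    design: Dict[str, Sequence[float]],
    drop: Optional[str] = None,
) -> List[float]:
    """Concatenate a semantic embedding with design features.

    If `drop` names a design feature, that feature is replaced by zeros,
    which is one simple (assumed) way to emulate a feature-removal ablation.
    """
    if len(semantic) != SEMANTIC_DIM:
        raise ValueError(f"expected {SEMANTIC_DIM}-D semantic vector")
    token = list(semantic)
    for name, dim in DESIGN_DIMS.items():
        values = design[name]
        if len(values) != dim:
            raise ValueError(f"{name} must have {dim} values")
        token.extend([0.0] * dim if name == drop else values)
    return token


# Toy usage with made-up values.
example_design = {name: [1.0] * dim for name, dim in DESIGN_DIMS.items()}
token = build_word_token([0.0] * SEMANTIC_DIM, example_design)
ablated = build_word_token([0.0] * SEMANTIC_DIM, example_design, drop="font_style")
```

In a setup like this, a sequence of such tokens (one per word on a cover) would be what a sequence model consumes; how the Hierarchical Transformer actually embeds and fuses the features is not specified here.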

Paper Structure (33 sections, 5 figures, 2 tables)

This paper contains 33 sections, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Overview of our experiments. We analyze text design on book covers in two scenarios. First, we compare the effectiveness of text design for book genres to semantic information. Second, we analyze what text design characterizes the book genres.
  • Figure 2: The architecture of the Hierarchical Transformer.
  • Figure 3: Differences between the confusion matrix obtained when a feature is removed and the confusion matrix of the full model. Each bar plot shows the variance of each text design element. Red dotted boxes mark the three highest variances; blue dotted boxes mark the three lowest.
  • Figure 4: Examples of attention visualization. The top row of each subfigure is extracted from the baseline, and the bottom row is extracted from the full model. The bottom row also shows the attention to the design features. Deeper red indicates stronger attention. (a) to (c) are samples correctly predicted by our model. (d) is not a wrong sample by our model.
  • Figure 5: Examples of book covers. The values on the left show the diagonal elements of the confusion matrix shown in Figure 3.
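
The ablation comparison described in the Figure 3 caption can be sketched as follows. This is one plausible reading, not the authors' procedure: the helper names, the toy three-genre matrices, and the choice to take the population variance over the per-genre diagonal changes are all assumptions.

```python
from statistics import pvariance
from typing import List

Matrix = List[List[float]]


def confusion_difference(full: Matrix, ablated: Matrix) -> Matrix:
    """Element-wise difference (ablated minus full) of two confusion matrices."""
    if len(full) != len(ablated) or any(
        len(f) != len(a) for f, a in zip(full, ablated)
    ):
        raise ValueError("confusion matrices must have the same shape")
    return [[a - f for f, a in zip(frow, arow)] for frow, arow in zip(full, ablated)]


def diagonal(matrix: Matrix) -> List[float]:
    """Diagonal entries, i.e. the per-genre correct-classification rates."""
    return [matrix[i][i] for i in range(len(matrix))]


# Made-up, row-normalized confusion matrices for three hypothetical genres.
full_model = [
    [0.5, 0.25, 0.25],
    [0.25, 0.5, 0.25],
    [0.25, 0.25, 0.5],
]
without_font_style = [
    [0.25, 0.5, 0.25],
    [0.25, 0.5, 0.25],
    [0.25, 0.25, 0.5],
]

diff = confusion_difference(full_model, without_font_style)
per_genre_change = diagonal(diff)    # how each genre's accuracy moved
spread = pvariance(per_genre_change)  # larger spread: effect is genre-specific
```

Under this reading, a design element whose removal changes only a few genres strongly produces a large variance, while an element that affects all genres similarly produces a small one.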