Towards Writing Style Adaptation in Handwriting Recognition

Jan Kohút, Michal Hradiš, Martin Kišš

TL;DR

This paper tackles handwriting recognition across many writing styles by introducing WS-Net, a CNN+LSTM architecture augmented with a Writer Style Block (WSB). The WSB is an adaptive instance normalization layer conditioned on writer-style embeddings, each associated with a writer-style identifier (WSI), which makes the style-dependent processing explicit. The authors compare randomly initialized embeddings with contrastively pre-trained ones. Pre-trained embeddings consistently improve transcription accuracy in writer-dependent settings, whereas embeddings learned from scratch are less reliable. In writer-independent settings, a simple fine-tuning baseline can outperform WS-Net, which points to the need for a more robust mechanism (e.g., attention-based aggregation) for handling unseen writers. The work demonstrates that explicit style adaptation is feasible in handwriting recognition and provides a foundation for future style-aware or attention-guided approaches that aim to close the writer-independent performance gap at low additional cost.

Abstract

One of the challenges of handwriting recognition is to transcribe a large number of vastly different writing styles. State-of-the-art approaches do not explicitly use information about the writer's style, which may be limiting overall accuracy due to various ambiguities. We explore models with writer-dependent parameters which take the writer's identity as an additional input. The proposed models can be trained on datasets with partitions likely written by a single author (e.g. single letter, diary, or chronicle). We propose a Writer Style Block (WSB), an adaptive instance normalization layer conditioned on learned embeddings of the partitions. We experimented with various placements and settings of WSB and contrastively pre-trained embeddings. We show that our approach outperforms a baseline with no WSB in a writer-dependent scenario and that it is possible to estimate embeddings for new writers. However, domain adaptation using simple fine-tuning in a writer-independent setting provides superior accuracy at a similar computational cost. The proposed approach should be further investigated in terms of training stability and embedding regularization to overcome such a baseline.
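
The abstract describes the WSB as an adaptive instance normalization (AdaIN) layer conditioned on a learned per-partition embedding. The following is a minimal pure-Python sketch of that idea, not the authors' implementation. The class name `WriterStyleBlockSketch`, the linear mapping from embedding to per-channel scale and shift, the `1.0 +` offset on the scale, and the random initialization are all assumptions made for illustration; the source does not specify them.

```python
import math
import random


def instance_norm(channel, eps=1e-5):
    """Normalize one channel's activations to zero mean and (near) unit variance."""
    mean = sum(channel) / len(channel)
    var = sum((x - mean) ** 2 for x in channel) / len(channel)
    return [(x - mean) / math.sqrt(var + eps) for x in channel]


class WriterStyleBlockSketch:
    """Hypothetical AdaIN layer: scale and shift are predicted from a writer embedding."""

    def __init__(self, num_writers, embed_dim, num_channels, seed=0):
        rng = random.Random(seed)
        # One embedding per writer-style identifier (WSI). In a real model these
        # would be trained or pre-trained; here they are random placeholders.
        self.embeddings = [
            [rng.gauss(0, 1) for _ in range(embed_dim)] for _ in range(num_writers)
        ]
        # Assumed linear maps from the embedding to per-channel scale and shift.
        self.w_scale = [
            [rng.gauss(0, 0.1) for _ in range(embed_dim)] for _ in range(num_channels)
        ]
        self.w_shift = [
            [rng.gauss(0, 0.1) for _ in range(embed_dim)] for _ in range(num_channels)
        ]

    def __call__(self, features, wsi):
        """features: list of channels, each a flat list of activations."""
        emb = self.embeddings[wsi]
        out = []
        for c, channel in enumerate(features):
            # The scale is offset by 1.0 so a zero embedding gives plain instance norm.
            scale = 1.0 + sum(w * e for w, e in zip(self.w_scale[c], emb))
            shift = sum(w * e for w, e in zip(self.w_shift[c], emb))
            out.append([scale * v + shift for v in instance_norm(channel)])
        return out


# Usage: the same feature map is modulated differently for two writers.
features = [[1.0, 2.0, 3.0, 4.0], [0.0, 10.0, 20.0, 30.0]]
block = WriterStyleBlockSketch(num_writers=3, embed_dim=4, num_channels=2)
styled_writer_0 = block(features, wsi=0)
styled_writer_1 = block(features, wsi=1)
```

In this sketch the only writer-dependent parameters are the embedding rows, so adapting to a new writer would, under these assumptions, amount to estimating one additional embedding vector.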

Paper Structure

This paper contains 14 sections, 4 equations, 6 figures, and 1 table.

Figures (6)

  • Figure 1: Our proposed Writer Style Block (WSB) learns to utilize various writer styles based on writer-style identifiers (WSI).
  • Figure 2: Our proposed neural network (WS-Net) consists of a convolutional part (CNN), a recurrent part (LSTM), and Writer Style Block (WSB).
  • Figure 3: Left, samples from the CzechHWR dataset. Right, representative words of 19 writers from Handwriting Adaptation Dataset.
  • Figure 4: Character error rate (CER) for Single AdaIN, All AdaIN, and the baseline on the test set. The graphs show CER for different embedding dimensions (ED) and for different initialization: randomly initialized (left) and pre-trained (right).
  • Figure 5: Character error rate (CER) measured on various testing and training clusters for Single AdaIN with randomly initialized embeddings (top) and pre-trained embeddings (bottom).
  • ...and 1 more figure
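
Figures 4 and 5 report results as character error rate (CER). As a reference for reading those plots, here is a short sketch of the conventional definition: the Levenshtein edit distance between the predicted and reference transcriptions, divided by the number of reference characters. The function names and the choice to aggregate over the whole test set rather than average per line are assumptions; the source does not state how CER was aggregated.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Total edits divided by total reference characters (assumed aggregation)."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars
```

For example, one substituted character in a four-character reference line gives a CER of 0.25.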