DNA: Dual-branch Network with Adaptation for Open-Set Online Handwriting Generation

Tsai-Ling Huang, Nhat-Tuong Do-Tran, Ngoc-Hoang-Lam Le, Hong-Han Shuai, Ching-Chun Huang

TL;DR

The paper tackles open-set online handwriting generation (OHG), where both the writer's style and the target characters are unseen during training, by introducing DNA, a Dual-branch Network with Adaptation that handles writer style and character content separately. The adaptive style branch extracts stroke-driven style patterns, while the adaptive content branch uses a local encoder (structure and components) and a global encoder (texture), combined through cross-attention, to generalize to unseen characters. A two-stage training strategy, which includes a spacing loss for inter-stroke alignment and content-guiding losses, yields state-of-the-art results on Traditional Chinese and Japanese OHG benchmarks and improves downstream recognition. The approach is useful for data synthesis and for improving the generalization of handwritten text recognition (HTR) in glyph-rich languages, and it generates samples more efficiently than diffusion-based methods.

Abstract

Online handwriting generation (OHG) enhances handwriting recognition models by synthesizing diverse, human-like samples. However, existing OHG methods struggle to generate unseen characters, particularly in glyph-based languages like Chinese, limiting their real-world applicability. In this paper, we introduce our method for OHG, where the writer's style and the characters generated during testing are unseen during training. To tackle this challenge, we propose a Dual-branch Network with Adaptation (DNA), which comprises an adaptive style branch and an adaptive content branch. The style branch learns stroke attributes such as writing direction, spacing, placement, and flow to generate realistic handwriting. Meanwhile, the content branch is designed to generalize effectively to unseen characters by decomposing character content into structural information and texture details, extracted via local and global encoders, respectively. Extensive experiments demonstrate that our DNA model is well-suited for the unseen OHG setting, achieving state-of-the-art performance.
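The content branch is described as combining structural features from a local encoder with texture features from a global encoder through cross-attention. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that texture tokens act as queries and structure tokens act as keys and values, that the result is added back residually, and that there are no learned projections. The function names `cross_attend` and `fuse` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys, values):
    """Scaled dot-product cross-attention (no learned projections: an assumption)."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(feature dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def fuse(texture, structure):
    """Assumed fusion: texture tokens attend to structure tokens, then a residual add."""
    attended = cross_attend(texture, structure, structure)
    return [[t + a for t, a in zip(tok, att)] for tok, att in zip(texture, attended)]


# Toy features: 2 texture tokens and 3 structure tokens, each of dimension 2.
texture = [[1.0, 0.0], [0.0, 1.0]]
structure = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
content_embedding = fuse(texture, structure)

# With a single structure token the attention weight is exactly 1,
# so the output is the texture token plus that structure token.
print(fuse([[1.0, 2.0]], [[3.0, 4.0]]))  # [[4.0, 6.0]]
```

In a trained model the queries, keys, and values would pass through learned linear layers first. The sketch omits them to keep the data flow visible.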

Paper Structure

This paper contains 13 sections, 10 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Illustration of the online handwriting generation setting. Our proposed DNA model is trained under the SWSC (Seen-Writer-Seen-Characters) Level 1 setting. During testing, we evaluate DNA’s ability to generate online handwriting characters across two other levels: UWSC (Unseen-Writer-Seen-Characters) and UWUC (Unseen-Writer-Unseen-Characters). Additionally, we compare DNA’s performance with the state-of-the-art UWSC method, SDT. The blue squares highlight incorrect strokes generated by SDT, particularly at the UWUC Level 3 setting.
  • Figure 2: Illustration of our proposed DNA. The style branch employs a style encoder $E_s$, followed by style and glyph heads, to extract style features $S_w^s$ and $G_w^s$. The content branch uses a local encoder $E^l_c$ for structural features and a global encoder $E^g_c$ for texture features. Structural features enhance texture features to form the enriched character content embedding $Z^{c}_i$. Along with previous point states $\{p^{w,i}_j\}_{j=1}^{t-1}$, this forms the point sequence $P^c_i$. The decoder then integrates $P^c_i$ with $S_w^s$ and $G_w^s$ to generate the online handwriting trajectory. Note that the spacing loss $\mathcal{L}_{sp}$ is applied only in the second training stage to ensure fluid handwriting.
  • Figure 3: Schematic diagram illustrating the structural and component decomposition of Chinese characters.
  • Figure 4: Stroke errors after Stage 1 training are highlighted by bounding boxes in the 'Generated' samples, while the 'Correct layout' illustrates the desired stroke positions.
  • Figure 5: Qualitative comparison of synthesized handwriting from offline and online methods on the UWSC and UWUC testing sets. Red boxes mark areas where content is incorrect. Blue boxes outline the stylistic expression of some strokes.
  • ...and 1 more figure
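The three settings named in the Figure 1 caption (SWSC, UWSC, UWUC) depend only on whether a test sample's writer and character appeared in the training set. The sketch below illustrates that classification under stated assumptions. The `Sample` type, the `evaluation_level` helper, and the toy writer IDs and characters are all hypothetical and are not taken from the paper's data pipeline.

```python
from typing import NamedTuple, Optional, Set


class Sample(NamedTuple):
    writer: str  # hypothetical writer ID
    char: str    # the character being written


def evaluation_level(sample: Sample,
                     train_writers: Set[str],
                     train_chars: Set[str]) -> Optional[str]:
    """Map a test sample to one of the settings named in Figure 1."""
    writer_seen = sample.writer in train_writers
    char_seen = sample.char in train_chars
    if writer_seen and char_seen:
        return "SWSC"  # Seen-Writer-Seen-Characters
    if not writer_seen and char_seen:
        return "UWSC"  # Unseen-Writer-Seen-Characters
    if not writer_seen and not char_seen:
        return "UWUC"  # Unseen-Writer-Unseen-Characters
    # Seen writer with unseen character: not one of the levels in the caption.
    return None


# Toy training vocabulary (illustrative values only).
train_writers = {"w1", "w2"}
train_chars = {"永", "字"}

samples = [Sample("w1", "永"), Sample("w9", "字"), Sample("w9", "龍")]
levels = [evaluation_level(s, train_writers, train_chars) for s in samples]
print(levels)  # ['SWSC', 'UWSC', 'UWUC']
```

Returning `None` for the seen-writer, unseen-character case keeps the helper from inventing a setting name that the caption does not define.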