Representation Learning for Tablet and Paper Domain Adaptation in Favor of Online Handwriting Recognition

Felix Ott, David Rügamer, Lucas Heublein, Bernd Bischl, Christopher Mutschler

TL;DR

This work tackles cross-domain online handwriting recognition (OnHW) between tablet and paper by learning a shared, domain-invariant representation through supervised domain adaptation. It combines a dual-network architecture with deep metric learning, employing triplet losses and large-margin sampling guided by edit distance, and aligns feature distributions via higher-order moment matching (HoMM) and CORAL across multiple fusion points. The approach is evaluated on sequence-based OnHW datasets, showing substantial improvements, notably HoMM of order 3 at an intermediate fusion point (c=3) achieving the strongest gains (e.g., 13.45% WER and 2.68% CER reductions) when paired with language-model post-processing. The study demonstrates that careful integration of DA losses, dynamic triplet sampling, and moment-based alignment can effectively mitigate cross-domain shifts in IMU-based handwriting data, enabling more robust, cross-device handwriting recognition in practical settings.
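
The idea of guiding the triplet margin by edit distance can be illustrated with a small sketch. The summary above does not give the exact sampling rule, so the linear scaling in `ed_margin` (and its `base_margin` parameter) is an assumption made purely for illustration, as are all function names; the embeddings are plain Python lists standing in for network features.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (row-wise dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            # deletion, insertion, substitution/match
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def sq_dist(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def triplet_loss(anchor, positive, negative, margin):
    """Standard hinge-style triplet loss on squared distances."""
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)


def ed_margin(anchor_label, negative_label, base_margin=1.0):
    """Hypothetical rule (an assumption, not the paper's): margin grows linearly with ED."""
    return base_margin * edit_distance(anchor_label, negative_label)


# Example: a tablet anchor for the word "hand", a positive with the same label,
# and a paper negative labelled "pen".
margin = ed_margin("hand", "pen")
loss = triplet_loss([0.0, 0.0], [1.0, 0.0], [1.0, 1.0], margin)
```

Under this assumed rule, negatives whose labels differ strongly from the anchor label are pushed further away than near-identical words, which is one plausible reading of "large-margin sampling guided by edit distance".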

Abstract

The performance of a machine learning model degrades when it is applied to data from a domain that is similar to, but different from, the one it was trained on. The goal of domain adaptation (DA) is to mitigate this domain shift by finding a feature transformation that yields a domain-invariant representation. Such a domain shift can appear in handwriting recognition (HWR) applications, where the motion pattern of the hand, and with it the motion pattern of the pen, differs between writing on paper and writing on a tablet. This becomes visible in the sensor data for online handwriting (OnHW) from pens with integrated inertial measurement units. This paper proposes a supervised DA approach to enhance learning for OnHW recognition between tablet and paper data. Our method exploits loss functions such as maximum mean discrepancy and correlation alignment to learn a domain-invariant feature representation (i.e., similar covariances between tablet and paper features). We use a triplet loss that takes negative samples from the auxiliary domain (i.e., paper samples) to increase the number of samples available for the tablet dataset. We conduct an evaluation on novel sequence-based OnHW datasets (i.e., words) and show an improvement on the paper domain with an early fusion strategy by using pairwise learning.
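
To make "similar covariances between tablet and paper features" concrete, the following is a minimal sketch of a correlation alignment (CORAL) loss in its commonly used form: the squared Frobenius norm of the difference between the two feature covariance matrices, scaled by 1/(4d²). Whether the paper uses exactly this scaling is not stated in the abstract, so treat it as an assumption; the function names and the toy feature batches are illustrative only.

```python
def covariance(features):
    """Unbiased covariance matrix of a batch given as a list of d-dimensional rows."""
    n, d = len(features), len(features[0])
    mean = [sum(row[k] for row in features) / n for k in range(d)]
    centered = [[row[k] - mean[k] for k in range(d)] for row in features]
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def coral_loss(source, target):
    """Squared Frobenius distance between covariances, scaled by 1 / (4 d^2)."""
    d = len(source[0])
    cs, ct = covariance(source), covariance(target)
    frob_sq = sum((cs[i][j] - ct[i][j]) ** 2 for i in range(d) for j in range(d))
    return frob_sq / (4 * d * d)


# Toy batches of 2-dimensional features (made-up numbers, not data from the paper).
tablet_feats = [[0.0, 0.0], [2.0, 2.0]]
paper_feats = [[0.0, 0.0], [2.0, 0.0]]
print(coral_loss(tablet_feats, paper_feats))
```

The loss is zero when both domains have identical second-order statistics, so minimizing it alongside the recognition loss pulls the tablet and paper feature distributions toward each other.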

Paper Structure (15 sections, 3 equations, 10 figures, 1 table)

Figures (10)

  • Figure 1: Comparison of accelerometer (1st row), gyroscope (2nd row), magnetometer (3rd row) and force (4th row) data from a sensor pen [ott] on paper (left) and tablet (right).
  • Figure 2: Detailed method overview: The main network (top pipeline, tablet data) and the auxiliary network (bottom pipeline, paper data) consist of the respective pre-trained architectures with convolutional and bidirectional layers. The weights are fine-tuned with domain adaptation techniques such as the triplet loss at five different fusion points.
  • Figure 3: Number of tablet-paper pairs as a function of the edit distance (ED).
  • Figure 4: Baseline results (WER: dashed, CER: solid, in %, averaged over cross-validation splits) for the InceptionTime (IT) [fawaz] and our CNN+BiLSTM architectures with different n-gram LMs. Left: WD task. Right: WI task. Legend (tablet = T, paper = P): the first letter denotes the training set and the second the validation set of the OnHW datasets [ott_tist].
  • Figure 5: Contrastive loss, validation: tablet, WD.
  • ...and 5 more figures