Uncertainty-aware Evaluation of Time-Series Classification for Online Handwriting Recognition with Domain Shift

Andreas Klaß; Sven M. Lorenz; Martin W. Lauer-Schmaltz; David Rügamer; Bernd Bischl; Christopher Mutschler; Felix Ott

TL;DR

This work tackles uncertainty quantification for online handwriting recognition under domain shift between right- and left-handed writers. It compares SWAG and Deep Ensembles as approximate Bayesian inference methods for spatio-temporal time-series data, and analyzes predictive reliability with both the uncertainty decomposition of Kwon et al. and information-theoretic measures: aleatoric uncertainty ($AU$), epistemic uncertainty ($EU$), total uncertainty ($TU$), and mutual information ($MI$). The authors evaluate calibration via confidence, ECE, and reliability diagrams across lowercase, uppercase, and combined character tasks, revealing calibration gaps under domain shift and the utility of combining uncertainty and domain-adaptation signals. They find that SWAG and Deep Ensembles yield similar accuracy and are applicable in real time, and that uncertainty metrics can help detect out-of-distribution samples and guide domain adaptation. The code is to be released.

Abstract

For many applications, analyzing the uncertainty of a machine learning model is indispensable. While research on uncertainty quantification (UQ) techniques is very advanced for computer vision applications, UQ methods for spatio-temporal data are less studied. In this paper, we focus on models for online handwriting recognition, one particular type of spatio-temporal data. The data are recorded with a sensor-enhanced pen, and the goal is to classify the written characters. We conduct a broad evaluation of aleatoric (data) and epistemic (model) UQ based on two prominent techniques for Bayesian inference, Stochastic Weight Averaging-Gaussian (SWAG) and Deep Ensembles. Besides providing a better understanding of the model, UQ techniques can detect out-of-distribution data and domain shifts when combining right-handed and left-handed writers (an underrepresented group).
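To make the distinction between aleatoric and epistemic uncertainty concrete, the following is a minimal sketch of the common entropy-based decomposition. It assumes class probabilities from several ensemble members or SWAG samples are already available. The function name `decompose_uncertainty`, the array layout, and the toy inputs are illustrative assumptions; the paper's exact definitions may differ in detail.

```python
import numpy as np

def decompose_uncertainty(probs, eps=1e-12):
    """Entropy-based uncertainty decomposition (illustrative sketch).

    probs: array of shape (M, N, C) holding the softmax outputs of
           M ensemble members (or posterior samples) for N inputs
           and C classes.
    Returns (total, aleatoric, epistemic), each of shape (N,), in nats.
    """
    probs = np.asarray(probs, dtype=float)
    mean_probs = probs.mean(axis=0)  # model-averaged prediction, (N, C)

    # Total uncertainty: entropy of the averaged prediction.
    total = -np.sum(mean_probs * np.log(mean_probs + eps), axis=-1)

    # Aleatoric uncertainty: average entropy of the individual predictions.
    aleatoric = -np.sum(probs * np.log(probs + eps), axis=-1).mean(axis=0)

    # Epistemic uncertainty: the difference, which equals the mutual
    # information between the prediction and the model parameters.
    epistemic = total - aleatoric
    return total, aleatoric, epistemic

# Hypothetical toy input: 2 members, 2 samples, 2 classes.
# Sample 0: both members are equally unsure (pure data uncertainty).
# Sample 1: members are confident but disagree (pure model uncertainty).
probs = np.array([
    [[0.5, 0.5], [1.0, 0.0]],
    [[0.5, 0.5], [0.0, 1.0]],
])
tu, au, eu = decompose_uncertainty(probs)
```

In this toy case both samples have the same total uncertainty, but it is attributed entirely to the aleatoric part for the first sample and entirely to the epistemic part for the second. High epistemic values of this kind are what one would inspect when looking for out-of-distribution inputs such as writers from an underrepresented group.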

Paper Structure (44 sections, 9 equations, 25 figures, 3 tables)

Introduction
Approximate Bayesian Inference Techniques.
Decomposing Uncertainty.
UQ for OnHW.
Contribution.
Related Work
UQ for Spatio-Temporal Reasoning
Online Handwriting Recognition
Methodological Background
Bayesian Model Averaging
Approximate Bayesian Inference
Stochastic Weight Averaging-Gaussian (SWAG).
Deep Ensembles.
Uncertainty Decomposition
Uncertainty Decomposition based on [Kwon et al.]
...and 29 more sections

Figures (25)

Figure 1: Mutual information.
Figure 2: Entropy.
Figure 3: Legend.
Figure 5: Evaluated on right-handed writers data.
Figure 6: Evaluated on right-handed writers data.
...and 20 more figures
