Capturing More: Learning Multi-Domain Representations for Robust Online Handwriting Verification
Peirong Zhang, Kai Ding, Lianwen Jin
TL;DR
This work tackles the limitation of temporal-only representations in online handwriting verification (OHV) by introducing SPECTRUM, a temporal-frequency multi-domain model. It realizes micro-to-macro multi-domain integration (M3I) through a multi-scale interactor for fine-grained temporal-frequency interaction and a self-gated fusion module for global feature balancing, complemented by a multi-domain distance-based verifier (MDV) that fuses DTW-based temporal distances with Euclidean frequency distances. Experiments on MSDS-ChS, MSDS-TDS, and DeepSignDB demonstrate consistent improvements over state-of-the-art OHV methods, and reveal that combining multiple handwritten biometrics (e.g., Chinese signatures and token digit strings) further enhances discrimination. The findings suggest that multi-domain learning across both feature and biometric domains offers a robust path to improving OHV in real-world applications, with public code available for reproducibility. Key components include the use of STFT-based spectrograms, a micro-scale interactor that applies $1\times1$ convolutions to temporal paths and a learnable 1D DFT on frequency paths, and the MDV decision mechanism that adaptively weights temporal penalties by frequency cues.
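The MDV idea of combining a DTW-based temporal distance with a Euclidean frequency distance can be illustrated with a minimal NumPy sketch. The fusion rule used here (scaling the temporal distance by `1 + alpha * d_freq`), the length normalization, and the names `dtw_distance`, `fused_distance`, and `alpha` are assumptions made for illustration only; the summary above does not specify SPECTRUM's actual weighting formula.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic DTW between sequences of shape (T_a, D) and (T_b, D).

    Uses Euclidean local cost and normalizes by the summed lengths
    (the normalization is an assumption of this sketch).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # Pairwise Euclidean cost between every frame of a and every frame of b.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m] / (n + m)

def fused_distance(temp_a, temp_b, freq_a, freq_b, alpha=1.0):
    """Hypothetical fusion: frequency distance scales the temporal penalty."""
    d_t = dtw_distance(temp_a, temp_b)
    d_f = float(np.linalg.norm(np.asarray(freq_a, dtype=float)
                               - np.asarray(freq_b, dtype=float)))
    return d_t * (1.0 + alpha * d_f)
```

In this sketch, a sample would be accepted as genuine when the fused distance to the enrolled reference falls below a threshold chosen on validation data.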
Abstract
In this paper, we propose SPECTRUM, a temporal-frequency synergistic model that unlocks the untapped potential of multi-domain representation learning for online handwriting verification (OHV). SPECTRUM comprises three core components: (1) a multi-scale interactor that finely combines temporal and frequency features through dual-modal sequence interaction and multi-scale aggregation; (2) a self-gated fusion module that dynamically integrates global temporal and frequency features via self-driven balancing; and (3) a multi-domain distance-based verifier that utilizes both temporal and frequency representations to improve discrimination between genuine and forged handwriting, surpassing conventional temporal-only approaches. The first two components work synergistically to achieve micro-to-macro spectral-temporal integration. Extensive experiments demonstrate SPECTRUM's superior performance over existing OHV methods, underscoring the effectiveness of temporal-frequency multi-domain learning. Furthermore, we reveal that incorporating multiple handwritten biometrics fundamentally enhances the discriminative power of handwriting representations and facilitates verification. These findings not only validate the efficacy of multi-domain learning in OHV but also pave the way for future research in multi-domain approaches across both feature and biometric domains. Code is publicly available at https://github.com/NiceRingNode/SPECTRUM.
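As a rough illustration of what a self-gated fusion of global temporal and frequency features could look like, the sketch below computes a sigmoid gate from the concatenated features and uses it to blend the two. This is a generic gated-fusion pattern, not the released SPECTRUM module; the function name `self_gated_fusion`, the single linear gate, and the shapes are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def self_gated_fusion(temporal, frequency, weight, bias):
    """Blend two feature sets of shape (N, D) with a gate derived from both.

    weight has shape (2 * D, D) and bias has shape (D,); in a trained
    model these would be learned (hypothetical parameters in this sketch).
    """
    joint = np.concatenate([temporal, frequency], axis=-1)  # (N, 2D)
    gate = sigmoid(joint @ weight + bias)                   # (N, D), values in (0, 1)
    # Per-dimension convex combination of the two domains.
    return gate * temporal + (1.0 - gate) * frequency

rng = np.random.default_rng(0)
dim = 8
temporal = rng.normal(size=(4, dim))
frequency = rng.normal(size=(4, dim))
weight = rng.normal(scale=0.1, size=(2 * dim, dim))
bias = np.zeros(dim)
fused = self_gated_fusion(temporal, frequency, weight, bias)
```

Because the gate lies strictly between 0 and 1, each fused value stays between the corresponding temporal and frequency values, so neither domain can be amplified beyond its input range.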
