Image-based Freeform Handwriting Authentication with Energy-oriented Self-Supervised Learning
Jingyao Wang, Luntian Mou, Changwen Zheng, Wen Gao
TL;DR
This work tackles freeform handwriting authentication in the presence of severe damage, complex high-dimensional features, and limited supervision. It introduces SherlockNet, an energy-oriented two-branch contrastive self-supervised framework with four stages: pre-processing via an energy operator, generalized pre-training with adaptive patch-weighted contrastive learning, personalized fine-tuning on a few labels, and practical deployment through modular APIs. A new dataset, EN-HA, simulates forgery and damage to mirror real-world conditions. Extensive experiments across six benchmarks demonstrate strong robustness and efficiency, often surpassing state-of-the-art baselines without requiring annotated data for pre-training. The approach enables accurate writer verification in messy, unconstrained handwriting scenarios and has practical deployment potential for archival, security, and forensic applications.
Abstract
Freeform handwriting authentication verifies a person's identity from their writing style and habits in messy handwriting data. The technique has attracted widespread attention in recent years as a valuable tool in fields such as fraud prevention and cultural heritage protection. However, it remains challenging in practice for three reasons: (i) severe damage, (ii) complex high-dimensional features, and (iii) lack of supervision. To address these issues, we propose SherlockNet, an energy-oriented two-branch contrastive self-supervised learning framework for robust and fast freeform handwriting authentication. It consists of four stages: (i) pre-processing: converting manuscripts into energy distributions using a novel plug-and-play energy-oriented operator to eliminate the influence of noise; (ii) generalized pre-training: learning general representations from the energy distributions through two-branch momentum-based adaptive contrastive learning, which handles the high-dimensional features and spatial dependencies of handwriting; (iii) personalized fine-tuning: calibrating the learned knowledge using a small amount of labeled data from downstream tasks; and (iv) practical application: identifying individual handwriting from scrambled, missing, or forged data efficiently and conveniently. To reflect practical conditions, we construct EN-HA, a novel dataset that simulates the data forgery and severe damage encountered in real applications. Finally, we conduct extensive experiments on six benchmark datasets, including EN-HA, and the results demonstrate the robustness and efficiency of SherlockNet.
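To make the generalized pre-training stage more concrete, the following is a minimal sketch of generic two-branch, momentum-based contrastive learning. It is not the authors' implementation. It assumes a standard InfoNCE objective and an exponential-moving-average update of the target branch, and it omits SherlockNet's adaptive weighting and energy operator. All names (`info_nce_loss`, `momentum_update`, the toy linear encoders, the temperature and momentum values) are hypothetical and chosen only for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def info_nce_loss(queries, keys, temperature=0.2):
    """InfoNCE loss: row i of `queries` should match row i of `keys`;
    every other row of `keys` acts as a negative."""
    q = l2_normalize(queries)
    k = l2_normalize(keys)
    logits = q @ k.T / temperature                       # (batch, batch) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

def momentum_update(online_params, target_params, momentum=0.99):
    """Exponential moving average: the target branch slowly tracks the online branch."""
    return [momentum * t + (1.0 - momentum) * o
            for o, t in zip(online_params, target_params)]

rng = np.random.default_rng(0)

# Hypothetical stand-in for two augmented views of 8 energy maps,
# each flattened to a 64-dimensional vector.
view_a = rng.normal(size=(8, 64))
view_b = view_a + 0.05 * rng.normal(size=(8, 64))

w_online = rng.normal(size=(64, 16))  # toy linear encoder, online branch
w_target = w_online.copy()            # target branch starts as a copy

# Matching pairs should give a lower loss than deliberately mismatched pairs.
loss_aligned = info_nce_loss(view_a @ w_online, view_b @ w_target)
loss_shuffled = info_nce_loss(view_a @ w_online, (view_b @ w_target)[::-1])

# After a (hypothetical) gradient step changes the online weights,
# the target weights move only a small fraction of the way toward them.
w_online_new = w_online + 0.1
(w_target_new,) = momentum_update([w_online_new], [w_target], momentum=0.99)
```

In this sketch only the online branch would receive gradients. The target branch is updated solely through `momentum_update`, which keeps the keys used in the loss stable from one step to the next.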
