
Self-Supervised Learning Based Handwriting Verification

Mihir Chauhan, Mohammad Abuzar Hashemi, Abhishek Satbhai, Mir Basheer Ali, Bina Ramamurthy, Mingchen Gao, Siwei Lyu, Sargur Srihari

TL;DR

This work tackles handwriting verification under limited labeled data by applying self-supervised learning (SSL) to learn robust writer-discriminative representations. It evaluates a broad spectrum of generative SSL (AR AIM, Flow, MAE, VAE, BiGAN) and contrastive SSL (MoCo, SimCLR, SimSiam, FastSiam, DINO, Barlow Twins, VICReg, etc.) approaches on the CEDAR-AND dataset, with ResNet-18 encoders for downstream verification. Among the generative approaches, a ResNet-18 encoder pre-trained as a Variational Auto-Encoder (VAE) performs best, achieving 76.3% accuracy; among the contrastive approaches, VICReg-based pretraining performs best, achieving 78% accuracy. These correspond to relative gains of 6.7% and 9%, respectively, over a supervised ResNet-18 baseline trained with 10% writer labels. A cosine-similarity separation metric (intra-writer vs. inter-writer) correlates with improved verification accuracy, supporting SSL-HV's effectiveness. Overall, SSL pretraining yields robust handwriting representations that improve writer verification with limited annotations, and future work could leverage full manuscripts and larger unlabeled datasets for broader writer-discrimination tasks.
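The intra-writer vs. inter-writer cosine-similarity comparison mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `cosine_separation`, the use of a simple difference of means as the separation score, and the toy embeddings are all assumptions made here for clarity.

```python
import numpy as np


def cosine_separation(embeddings, writer_ids):
    """Return (mean intra-writer similarity, mean inter-writer similarity, gap).

    Assumption: "separation" is taken to be the difference of the two means;
    the paper's exact definition is not given in this summary.
    """
    x = np.asarray(embeddings, dtype=float)
    ids = np.asarray(writer_ids)
    # L2-normalise each row so that dot products equal cosine similarities.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = x @ x.T
    # Visit each unordered pair once and skip self-pairs.
    i, j = np.triu_indices(len(ids), k=1)
    same_writer = ids[i] == ids[j]
    intra = sim[i, j][same_writer].mean()
    inter = sim[i, j][~same_writer].mean()
    return intra, inter, intra - inter


# Hypothetical 2-D embeddings for two writers, two samples each.
toy_embeddings = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
toy_writers = [0, 0, 1, 1]
intra, inter, gap = cosine_separation(toy_embeddings, toy_writers)
print(f"intra={intra:.3f} inter={inter:.3f} gap={gap:.3f}")
```

A larger gap means same-writer samples sit closer together in embedding space than different-writer samples, which is the property a verification model needs.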

Abstract

We present SSL-HV: Self-Supervised Learning approaches applied to the task of Handwriting Verification. This task involves determining whether a given pair of handwritten images originates from the same or a different writer distribution. We compare the performance of multiple generative and contrastive SSL approaches against handcrafted feature extractors and supervised learning on the CEDAR AND dataset. We show that a ResNet-based Variational Auto-Encoder (VAE) outperforms other generative approaches, achieving 76.3% accuracy, while ResNet-18 fine-tuned using Variance-Invariance-Covariance Regularization (VICReg) outperforms other contrastive approaches, achieving 78% accuracy. Using a pre-trained VAE and VICReg for the downstream task of writer verification, we observed relative improvements in accuracy of 6.7% and 9%, respectively, over a supervised ResNet-18 baseline with 10% writer labels.
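The improvements quoted above are relative, not percentage-point, gains. As a rough consistency check, the sketch below back-computes the baseline accuracy they imply, assuming the usual definition of relative gain, (accuracy - baseline) / baseline. The helper name `implied_baseline` is hypothetical, and the resulting figure is derived here rather than reported in this summary.

```python
def implied_baseline(accuracy_pct, relative_gain_pct):
    """Back out the baseline accuracy from an accuracy and its relative gain.

    Assumes relative gain = (accuracy - baseline) / baseline.
    """
    return accuracy_pct / (1.0 + relative_gain_pct / 100.0)


# Figures taken from the abstract: VAE 76.3% (+6.7%), VICReg 78% (+9%).
vae_baseline = implied_baseline(76.3, 6.7)
vicreg_baseline = implied_baseline(78.0, 9.0)
print(f"implied baseline from VAE:    {vae_baseline:.2f}%")
print(f"implied baseline from VICReg: {vicreg_baseline:.2f}%")
```

Both figures point to a supervised baseline of roughly 71.5% accuracy, so the two reported gains are consistent with a single shared baseline.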

Paper Structure (10 sections, 2 equations, 2 figures, 2 tables)

Figures (2)

  • Figure 1: The overall framework for Self-Supervised based Handwriting Verification (SSL-HV)
  • Figure 2: Data Augmentation Views from an example original image of word "AND".