Unsupervised Attention Regularization Based Domain Adaptation for Oracle Character Recognition
Mei Wang, Weihong Deng, Jiani Hu, Sen Su
TL;DR
This work tackles the domain gap between handprinted and scanned oracle characters by introducing the unsupervised attention regularization network (UARN). UARN combines adversarial domain alignment and pseudo-labeling with two novel attention regularizations grounded in CAM-based explanations: attention consistency under flipping, and attention discriminability between the pseudo class and the most confusing class. Empirical results on Oracle-241 (and digits datasets) show UARN achieving state-of-the-art target accuracy and improved interpretability, with ablation and sensitivity analyses confirming the importance of each component. The approach offers practical gains for archaeology-driven OCR and highlights the value of perceptually plausible regularization in cross-domain recognition.
Abstract
The study of oracle characters plays an important role in Chinese archaeology and philology. However, the difficulty of collecting and annotating real-world scanned oracle characters hinders the development of oracle character recognition. In this paper, we develop a novel unsupervised domain adaptation (UDA) method, the unsupervised attention regularization network (UARN), to transfer recognition knowledge from labeled handprinted oracle characters to unlabeled scanned data. First, we show experimentally that existing UDA methods are not always consistent with human priors and cannot achieve optimal performance on the target domain. Oracle characters are flip-insensitive and exhibit high inter-class similarity, yet the interpretations produced by these models are neither flip-consistent nor class-separable. To tackle this challenge, we take visual perceptual plausibility into consideration when adapting. Specifically, our method enforces attention consistency between the original and flipped images to make the model robust to flipping. Simultaneously, we constrain attention separability between the pseudo class and the most confusing class to improve model discriminability. Extensive experiments demonstrate that UARN shows better interpretability and achieves state-of-the-art performance on the Oracle-241 dataset, substantially outperforming the previous structure-texture separation network by 8.5%.
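The two attention regularizers described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact loss forms, so the mean-squared-error consistency term, the product-based overlap term, and all function names here are assumptions made for illustration. Attention maps are computed as class activation maps (CAM), i.e. a classifier-weighted sum of the last convolutional feature maps.

```python
import numpy as np


def class_activation_map(features, fc_weights, class_idx):
    """CAM for one class: classifier-weighted sum of the K feature maps.

    features: (K, H, W) conv features; fc_weights: (C, K) classifier weights.
    """
    return np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))


def normalize(cam, eps=1e-8):
    """Rescale a map to roughly [0, 1] so maps of different classes are comparable."""
    cam = cam - cam.min()
    return cam / (cam.max() + eps)


def flip_consistency_loss(cam_orig, cam_of_flipped):
    """Assumed form: MSE between the flipped CAM of the original image
    and the CAM of the horizontally flipped image."""
    return float(np.mean((cam_orig[:, ::-1] - cam_of_flipped) ** 2))


def most_confusing_class(logits, pseudo_label):
    """Highest-scoring class other than the pseudo label."""
    masked = np.asarray(logits, dtype=float).copy()
    masked[pseudo_label] = -np.inf
    return int(np.argmax(masked))


def attention_overlap_loss(cam_pseudo, cam_confusing):
    """Assumed form: mean elementwise product of the normalized maps.
    Lower values mean the two classes attend to more separate regions."""
    return float(np.mean(normalize(cam_pseudo) * normalize(cam_confusing)))


# Toy unlabeled target sample with random features (hypothetical sizes).
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3, 5))    # K=4 channels, 3x5 spatial grid
fc_weights = rng.normal(size=(3, 4))     # C=3 classes

# Global average pooling followed by the linear classifier gives the logits.
logits = fc_weights @ features.mean(axis=(1, 2))
pseudo = int(np.argmax(logits))
confusing = most_confusing_class(logits, pseudo)

# Assumption: an ideal flip-equivariant backbone, so flipping the image
# simply flips the feature maps. A real network would deviate from this.
features_flipped = features[:, :, ::-1]

cam = class_activation_map(features, fc_weights, pseudo)
cam_flip = class_activation_map(features_flipped, fc_weights, pseudo)
cam_conf = class_activation_map(features, fc_weights, confusing)

l_con = flip_consistency_loss(cam, cam_flip)
l_dis = attention_overlap_loss(cam, cam_conf)
print("consistency:", l_con, "overlap:", l_dis)
```

Under the idealized flip-equivariance assumption the consistency term vanishes up to floating-point error; with a trained network it would generally be nonzero, and both terms would be added to the adversarial and pseudo-label objectives with weighting factors that the abstract does not specify.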
