Unsupervised Attention Regularization Based Domain Adaptation for Oracle Character Recognition

Mei Wang, Weihong Deng, Jiani Hu, Sen Su

TL;DR

This work tackles the domain gap between handprinted and scanned oracle characters by introducing the unsupervised attention regularization network (UARN). UARN combines adversarial domain alignment and pseudo-labeling with two novel attention regularizers grounded in CAM-based explanations: attention consistency under flipping, and attention discriminability between the pseudo class and the most confusing class. Empirical results on Oracle-241 (and digit datasets) show UARN achieving state-of-the-art target accuracy and improved interpretability, with ablation and sensitivity analyses confirming the importance of each component. The approach offers practical gains for archaeology-driven OCR and highlights the value of perceptually plausible regularization in cross-domain recognition.

Abstract

The study of oracle characters plays an important role in Chinese archaeology and philology. However, the difficulty of collecting and annotating real-world scanned oracle characters hinders the development of oracle character recognition. In this paper, we develop a novel unsupervised domain adaptation (UDA) method, i.e., unsupervised attention regularization network (UARN), to transfer recognition knowledge from labeled handprinted oracle characters to unlabeled scanned data. First, we demonstrate experimentally that existing UDA methods are not always consistent with human priors and cannot achieve optimal performance on the target domain. Although oracle characters are flip-insensitive and have high inter-class similarity, the model interpretations produced by these methods are neither flip-consistent nor class-separable. To tackle this challenge, we take visual perceptual plausibility into consideration when adapting. Specifically, our method enforces attention consistency between the original and flipped images to make the model robust to flipping. Simultaneously, we constrain attention separability between the pseudo class and the most confusing class to improve the model's discriminability. Extensive experiments demonstrate that UARN shows better interpretability and achieves state-of-the-art performance on the Oracle-241 dataset, substantially outperforming the previous structure-texture separation network by 8.5%.

Paper Structure

This paper contains 20 sections, 12 equations, 14 figures, 8 tables, 1 algorithm.

Figures (14)

  • Figure 1: An illustration of attention maps for scanned oracle characters. (a) With the existing UDA method, flipping an image does not flip the attention map, while our UARN significantly improves attention consistency. (b) The existing UDA method attends to similar regions when identifying the pixels relevant to the classes “bull” and “son”, while our UARN makes the attention maps separable and tells the confusing classes apart on the target domain.
  • Figure 2: Illustration of UARN. We utilize adversarial learning and pseudo labeling to learn domain-invariant features, and simultaneously enforce the consistency and discriminability of attention heatmaps to achieve better classification accuracy and visual perceptual plausibility on the target domain. For attention consistency, we reduce the distance between the attention map of the flipped image and the flipped attention map of the original image to improve the model's robustness to flipping. For attention discriminability, we reduce the overlap between the attention maps of the pseudo class and the most confusing class to eliminate visual confusion.
  • Figure 3: The characters in the right three columns exhibit a left-right mirrored orientation in comparison to those in the left three columns.
  • Figure 4: Examples of handprinted and scanned characters in Oracle-241.
  • Figure 5: Parameter sensitivity investigations of $\mu$ and $\lambda$ in terms of target accuracy on the Oracle-241 dataset.
  • ...and 9 more figures
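
The two attention regularizers described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration, not the paper's exact formulation. It assumes standard class activation maps (a weighted sum of the final feature maps using the classifier weights). The function names, the mean-squared distance, the min-based overlap measure, and the choice of the runner-up class as the "most confusing" class are all assumptions made for this example.

```python
import numpy as np

def class_activation_maps(features, fc_weights):
    """Standard CAM: features (C, H, W) and classifier weights (K, C) -> maps (K, H, W)."""
    return np.einsum("kc,chw->khw", fc_weights, features)

def normalize_map(cam, eps=1e-8):
    """Min-max scale a single attention map to roughly [0, 1]."""
    cam = cam - cam.min()
    return cam / (cam.max() + eps)

def attention_consistency_loss(cam_orig, cam_of_flipped):
    """Distance between the flipped map of the original image and the map of the
    flipped image (mean squared error is an assumed choice of distance)."""
    flipped_cam = np.flip(cam_orig, axis=-1)  # horizontal (left-right) flip
    return float(np.mean((flipped_cam - cam_of_flipped) ** 2))

def attention_discriminability_loss(cams, logits):
    """Overlap between the maps of the pseudo class (top prediction) and the most
    confusing class (assumed here to be the runner-up prediction)."""
    order = np.argsort(logits)[::-1]
    pseudo, confusing = order[0], order[1]
    a = normalize_map(cams[pseudo])
    b = normalize_map(cams[confusing])
    return float(np.mean(np.minimum(a, b)))  # soft intersection; assumed overlap measure

# Toy usage with a hypothetical position-dependent feature extractor,
# which is not flip-equivariant and therefore incurs a consistency penalty.
image = np.arange(24, dtype=float).reshape(4, 6)
ramp = np.linspace(0.0, 1.0, 6)
extract = lambda x: np.stack([x, x * ramp])  # stand-in for a CNN backbone
weights = np.array([[1.0, 0.5], [0.2, -1.0], [-0.3, 0.8]])  # 3 classes, 2 channels

cams = class_activation_maps(extract(image), weights)
cams_flipped = class_activation_maps(extract(np.flip(image, axis=-1)), weights)
logits = np.array([2.0, 1.0, 0.0])  # hypothetical class scores for this image

print("consistency loss:", attention_consistency_loss(cams[0], cams_flipped[0]))
print("discriminability loss:", attention_discriminability_loss(cams, logits))
```

In a training setup of the kind the caption describes, terms like these would be added to the adversarial and pseudo-labeling objectives with trade-off weights. Figure 5 studies the sensitivity to two such parameters, $\mu$ and $\lambda$, but how they map onto the individual terms is not specified here.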