HABD: A Houma Alliance Book Ancient Handwritten Character Recognition Database
Xiaoyu Yuan, Xiaohua Huang, Zibo Zhang, Yabo Sun
TL;DR
The paper tackles recognition of ancient Houma Alliance Book characters under data scarcity and a long-tailed class distribution by constructing the HABD dataset, with 26,732 samples across 327 classes, via iterative annotation. It benchmarks four deep learning architectures (AlexNet, ResNet, ViT, CrossViT) and introduces Mixup-based long-tail augmentation plus decision-level classifier fusion, achieving a best fusion accuracy of 95.21% (DLF-RVC). Baseline results show that Transformers outperform CNNs and that Mixup improves generalization, especially for tail classes, illustrating the value of long-tail strategies in historical-script recognition. The work provides a resource for cultural heritage digitization and a foundation for extending the approach to other ancient scripts and to the preservation of historical literature.
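As a rough illustration of the two techniques named in the summary, the sketch below shows generic Mixup (blending two samples and their one-hot labels with a Beta-distributed weight) and a simple form of decision-level fusion (averaging class probabilities from several classifiers). It is not the authors' implementation: the function names `mixup` and `fuse_decisions`, the `alpha` value, and the choice of plain probability averaging as the fusion rule are all assumptions made for illustration, and the paper's long-tail variant of Mixup and its DLF fusion rule may differ.

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two samples and their one-hot labels (generic Mixup).

    alpha is an assumed hyperparameter; the mixing weight lam is drawn
    from Beta(alpha, alpha).
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


def fuse_decisions(prob_vectors):
    """Average per-classifier class probabilities (assumed fusion rule).

    Returns the index of the highest averaged probability together with
    the averaged probability vector.
    """
    n = len(prob_vectors)
    avg = [sum(col) / n for col in zip(*prob_vectors)]
    pred = max(range(len(avg)), key=avg.__getitem__)
    return pred, avg


if __name__ == "__main__":
    # Two toy "images" (flattened pixels) from two different classes.
    x, y, lam = mixup([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
    print("mixed sample:", x, "mixed label:", y, "lam:", lam)

    # Hypothetical softmax outputs from three classifiers over three classes.
    pred, avg = fuse_decisions([
        [0.6, 0.3, 0.1],
        [0.2, 0.5, 0.3],
        [0.1, 0.7, 0.2],
    ])
    print("fused prediction:", pred, "averaged probabilities:", avg)
```

In a long-tailed setting, one common choice is to sample the second Mixup partner preferentially from tail classes so that rare characters appear more often in mixed samples; whether HABD's benchmark does exactly this is not stated in the summary.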
Abstract
The Houma Alliance Book, one of history's earliest calligraphic examples, was unearthed in the 1970s. These artifacts were meticulously organized, reproduced, and copied by the Shanxi Provincial Institute of Cultural Relics. However, because of their ancient origins and severe ink erosion, identifying characters in the Houma Alliance Book is challenging, necessitating the use of digital technology. In this paper, we propose a new ancient handwritten character recognition database for the Houma Alliance Book, along with a novel benchmark based on deep learning architectures. More specifically, a collection of 26,732 character samples covering 327 different ancient characters was gathered from the Houma Alliance Book through iterative annotation. Furthermore, benchmark algorithms were proposed by combining four deep neural network classifiers with two data augmentation methods. This research provides valuable resources and technical support for further studies on the Houma Alliance Book and other ancient characters. It thereby contributes to our understanding of ancient culture and history, as well as to the preservation and inheritance of humanity's cultural heritage.
