NeuroPapyri: A Deep Attention Embedding Network for Handwritten Papyri Retrieval
Giuseppe De Gregorio, Simon Perrin, Rodrigo C. G. Pena, Isabelle Marthot-Santaniello, Harold Mouchère
TL;DR
This paper addresses interpretable retrieval of historical papyrus images by introducing NeuroPapyri, an embedding network for handwritten Greek papyri that combines a CNN with multi-head attention. The model is trained with a weighted combination of two losses, $\mathrm{Loss} = w_1 \mathrm{Loss}_A + w_2 \mathrm{Loss}_T$ with $w_1 + w_2 = 1$, to produce discriminative embeddings together with attention maps that can be visualized, so that paleographers can inspect which image regions drive the model's decisions. On the synthetic AL-PUBv2 and the ICDAR2023 Iliad-based datasets, NeuroPapyri demonstrates strong character-identification performance and state-of-the-art document retrieval (Top-1 accuracy up to 96.57% and F1@1 around 94.00). The attention visualizations support interpretability and collaboration between historians and computer scientists, and the authors point to writer identification and dating of papyri as possible extensions in future work.
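The weighted combination above can be illustrated with a minimal sketch. It only shows how the constraint $w_1 + w_2 = 1$ leaves a single free weight; the function name `combined_loss` and the use of plain floats are assumptions made for illustration, and this summary does not specify how $\mathrm{Loss}_A$ and $\mathrm{Loss}_T$ are computed.

```python
def combined_loss(loss_a: float, loss_t: float, w1: float) -> float:
    """Return w1 * loss_a + w2 * loss_t, where w2 = 1 - w1.

    loss_a and loss_t stand in for Loss_A and Loss_T from the text;
    how each is computed is not covered here (hypothetical inputs).
    """
    if not 0.0 <= w1 <= 1.0:
        raise ValueError("w1 must lie in [0, 1] so that w1 + w2 = 1 with w2 >= 0")
    w2 = 1.0 - w1  # the constraint w1 + w2 = 1 fixes the second weight
    return w1 * loss_a + w2 * loss_t


if __name__ == "__main__":
    # Equal weighting of two illustrative loss values.
    print(combined_loss(2.0, 4.0, 0.5))
```

Setting `w1` to 1 or 0 recovers training on a single loss term, which is a convenient way to check how much each term contributes.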
Abstract
The intersection of computer vision and machine learning has emerged as a promising avenue for advancing historical research, facilitating a more profound exploration of our past. However, the application of machine learning approaches in historical palaeography is often met with criticism due to their perceived "black box" nature. In response to this challenge, we introduce NeuroPapyri, an innovative deep learning-based model specifically designed for the analysis of images containing ancient Greek papyri. To address concerns related to transparency and interpretability, the model incorporates an attention mechanism. This attention mechanism not only enhances the model's performance but also provides a visual representation of the image regions that significantly contribute to the decision-making process. Specifically calibrated for processing images of papyrus documents with lines of handwritten text, the model utilizes individual attention maps to indicate the presence or absence of specific characters in the input image. This paper presents the NeuroPapyri model, including its architecture and training methodology. Results from the evaluation demonstrate NeuroPapyri's efficacy in document retrieval, showcasing its potential to advance the analysis of historical manuscripts.
