Segmentation of Maya hieroglyphs through fine-tuned foundation models
FNU Shivam, Megan Leight, Mary Kate Kelly, Claire Davis, Kelsey Clodfelter, Jacob Thrasher, Yenumula Reddy, Prashnna Gyawali
TL;DR
This work tackles the segmentation of Maya hieroglyph blocks, a task made difficult by noise and limited labeled data. It evaluates Segment Anything (SAM) as a baseline and introduces Finetuned-SAM, which freezes the image and prompt encoders and trains the mask decoder on expert-annotated glyph masks derived from Kerr's Maya Vase Database. Training uses a combined loss $L = \alpha L_{CE} + \beta L_{Dice}$ with equal weights, for $300$ epochs at a learning rate of $1\times 10^{-3}$. Finetuned-SAM outperforms off-the-shelf SAM and traditional baselines, with substantial gains in IoU and Dice. These results support the value of expert annotations and domain-specific fine-tuning for ancient-script segmentation. The authors plan to open-source the dataset for broader access and future research. The work aims to make Maya inscriptions more accessible to researchers and heritage communities, potentially aiding decipherment and public engagement.
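The combined loss $L = \alpha L_{CE} + \beta L_{Dice}$ can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: it assumes binary (glyph vs. background) masks flattened to lists of per-pixel probabilities, a soft Dice formulation with a smoothing term, and $\alpha = \beta = 1$ as one possible reading of "equal weights". The function names and the `eps` and `smooth` parameters are hypothetical.

```python
import math

def cross_entropy_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over pixels (assumes a binary mask)."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum_p + sum_y + smooth)."""
    overlap = sum(p * y for p, y in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * overlap + smooth) / (denom + smooth)

def combined_loss(probs, targets, alpha=1.0, beta=1.0):
    """L = alpha * L_CE + beta * L_Dice; alpha = beta = 1 is an assumption."""
    return (alpha * cross_entropy_loss(probs, targets)
            + beta * dice_loss(probs, targets))

if __name__ == "__main__":
    target = [1, 0, 1, 0]          # toy ground-truth mask
    confident = [0.9, 0.1, 0.8, 0.2]
    uncertain = [0.5, 0.5, 0.5, 0.5]
    print(combined_loss(confident, target))
    print(combined_loss(uncertain, target))
```

The cross-entropy term penalizes each pixel independently, while the Dice term rewards overlap between the predicted and annotated regions, which is why the two are often combined when foreground pixels are sparse. In a fine-tuning setup like the one described, a loss of this form would be applied to the mask decoder's output while the frozen encoders receive no gradient updates.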
Abstract
The study of Maya hieroglyphic writing unlocks the rich history of cultural and societal knowledge embedded within this ancient civilization's visual narrative. Artificial Intelligence (AI) offers a novel lens through which we can translate these inscriptions, with the potential to give non-specialists access to these texts and to aid in the decipherment of hieroglyphs that continue to elude comprehensive interpretation. Toward this goal, we leverage a foundation model to segment Maya hieroglyphs from an open-source digital library dedicated to Maya artifacts. Despite the promise of publicly available foundation segmentation models, their accuracy on Maya hieroglyphs was limited before fine-tuning. To address this, we carefully curated image and label pairs with the assistance of experts in Maya art and history and used them to fine-tune these models. Fine-tuning significantly improved model performance, illustrating the potential of fine-tuning approaches and the value of our expanding dataset. We plan to open-source this dataset to encourage future research and, eventually, to help make the hieroglyphic texts legible to a broader community, particularly members of Maya heritage communities.
