Segmentation of Maya hieroglyphs through fine-tuned foundation models

FNU Shivam; Megan Leight; Mary Kate Kelly; Claire Davis; Kelsey Clodfelter; Jacob Thrasher; Yenumula Reddy; Prashnna Gyawali

TL;DR

This work tackles the segmentation of Maya hieroglyph blocks, a task made difficult by image noise and scarce labeled data. It evaluates Segment Anything (SAM) as a baseline and introduces Finetuned-SAM, which freezes the image and prompt encoders and trains only the mask decoder on expert-annotated glyph masks derived from Kerr's Maya Vase Database. Training uses a combined loss $L = \alpha L_{CE} + \beta L_{Dice}$ with equal weights, run for $300$ epochs at a learning rate of $1\times 10^{-3}$. Finetuned-SAM outperforms off-the-shelf SAM and traditional baselines, with substantial gains in IoU and Dice, which supports the value of expert annotations and domain-specific fine-tuning for ancient-script segmentation. The authors plan to open-source the dataset for broader access and future research. The work aims to make Maya inscriptions more accessible to researchers and heritage communities, potentially aiding decipherment and public engagement.
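To make the loss and the reported metrics concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that $L_{CE}$ is a per-pixel binary cross-entropy over foreground probabilities, that the Dice loss is one minus a soft Dice coefficient, and that $\alpha = \beta = 1$ (the source says only that the weights are equal). All function names are hypothetical, and masks are represented as flat lists of pixel values.

```python
import math


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened per-pixel probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp so log() never sees 0
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def soft_dice(probs, targets, eps=1e-7):
    """Dice coefficient 2|A∩B| / (|A| + |B|); accepts soft or hard masks."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return (2.0 * intersection + eps) / (denominator + eps)


def combined_loss(probs, targets, alpha=1.0, beta=1.0):
    """L = alpha * L_CE + beta * L_Dice.

    alpha = beta = 1 is an assumed choice of "equal weights".
    """
    l_ce = binary_cross_entropy(probs, targets)
    l_dice = 1.0 - soft_dice(probs, targets)
    return alpha * l_ce + beta * l_dice


def iou(pred_mask, true_mask, eps=1e-7):
    """Intersection over union for two hard (0/1) masks."""
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    union = sum(1 for p, t in zip(pred_mask, true_mask) if p or t)
    return (intersection + eps) / (union + eps)


if __name__ == "__main__":
    target = [1, 1, 0, 0]          # toy 4-pixel ground-truth mask
    good = [0.9, 0.8, 0.1, 0.2]    # confident, mostly correct prediction
    bad = [0.2, 0.1, 0.9, 0.8]     # confident, mostly wrong prediction
    print("loss (good):", combined_loss(good, target))
    print("loss (bad): ", combined_loss(bad, target))
    hard = [1 if p >= 0.5 else 0 for p in good]
    print("IoU:", iou(hard, target), "Dice:", soft_dice(hard, target))
```

In an actual fine-tuning run, a loss of this form would be computed on the mask decoder's output and back-propagated through the decoder only, since the image and prompt encoders are frozen.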

Abstract

The study of Maya hieroglyphic writing unlocks the rich history of cultural and societal knowledge embedded within this ancient civilization's visual narrative. Artificial Intelligence (AI) offers a novel lens through which we can translate these inscriptions, with the potential to give non-specialists access to these texts and to aid in the decipherment of hieroglyphs that continue to elude comprehensive interpretation. Toward this goal, we leverage a foundation model to segment Maya hieroglyphs from an open-source digital library dedicated to Maya artifacts. Despite the promise of publicly available foundation segmentation models, their out-of-the-box accuracy on Maya hieroglyphs was limited. To address this, we curated image and label pairs with the assistance of experts in Maya art and history and used them to fine-tune these models. Fine-tuning significantly improved model performance, illustrating the potential of fine-tuning approaches and the value of our expanding dataset. We plan to open-source this dataset to encourage future research, and eventually to help make the hieroglyphic texts legible to a broader community, particularly members of Maya heritage communities.

Paper Structure (12 sections, 1 equation, 3 figures, 2 tables)


Introduction
Dataset preparation and Pre-processing
Data Sources
Data Preparation
Methodology
Preliminary: Segment Anything (SAM)
Finetuned-SAM
Experiments and Results
Setup
Results
Conclusion
Acknowledgement

Figures (3)

Figure 1: Examples of Maya vessels from Kerr's Maya Vase Database (K505 and K3433). A blue box highlights the shape of the glyph block, although K505 is pseudoglyphic. The segmentation of these glyph blocks into their individual hieroglyphic characters or symbols is the primary focus of this study.
Figure 2: Binary mask (right) generated through manual annotation of glyphs from the original input image (left).
Figure 3: Qualitative analysis of the results from SAM and Finetuned-SAM for selected samples in our test dataset. For both SAM and Finetuned-SAM, we display the result when two random points are chosen as prompts (denoted by ).
