Automated Mosaic Tesserae Segmentation via Deep Learning Techniques

Charilaos Kapelonis, Marios Antonakakis, Konstantinos Politof, Aristomenis Antoniadis, Michalis Zervakis

TL;DR

The paper tackles automatic tesserae segmentation in mosaics to aid cultural heritage digitization. It fine-tunes the Segment Anything Model 2 (SAM 2) on a newly annotated mosaic dataset, employing a Dice-based loss with calibrated confidence to optimize tesserae delineation. On the authors' test set, fine-tuning raises IoU from 89.00% to 91.02% and Recall from 92.12% to 95.89% over the baseline SAM 2. On a two-image benchmark from prior work, the model surpasses previous methods, with an F-measure 3% higher and the tesserae count error reduced from 0.20 to 0.02. Together with the labeled mosaic dataset and the training and configuration details, the work is a step toward real-time segmentation and scalable digital preservation of mosaics.

Abstract

Art is widely recognized as a reflection of civilization, and mosaics represent an important part of cultural heritage. Mosaics are an ancient art form created by arranging small pieces, called tesserae, on a surface using adhesive. Due to their age and fragility, they are prone to damage, highlighting the need for digital preservation. This paper addresses the problem of digitizing mosaics by segmenting the tesserae to separate them from the background, a task within the broader field of Image Segmentation in Computer Vision. We propose a method leveraging Segment Anything Model 2 (SAM 2) by Meta AI, a foundation model that outperforms most conventional segmentation models, to automatically segment mosaics. Because open datasets in the field are limited, we also create an annotated dataset of mosaic images to fine-tune and evaluate the model. Quantitative evaluation on our testing dataset shows notable improvements compared to the baseline SAM 2 model, with Intersection over Union increasing from 89.00% to 91.02% and Recall from 92.12% to 95.89%. Additionally, on a benchmark proposed by a prior approach, our model achieves an F-measure 3% higher than previous methods and reduces the tesserae count error (the absolute difference between predicted and actual tesserae) from 0.20 to just 0.02. The notable performance of the fine-tuned SAM 2 model together with the newly annotated dataset can pave the way for real-time segmentation of mosaic images.
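
As a rough illustration of the metrics quoted in the abstract, the sketch below computes pixel-level Intersection over Union, Recall, and F-measure for a pair of binary masks, plus a count error. It is not the authors' evaluation code. The function names, the nested-list mask representation, and the normalization of the count error by the true tesserae count are all assumptions made for this example; the abstract does not state how the count error is normalized.

```python
def confusion_counts(pred, truth):
    """Count pixel-level true positives, false positives, and false negatives.

    Both masks are nested lists of 0/1 values with the same shape
    (an assumed representation chosen for this sketch).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def segmentation_metrics(pred, truth):
    """Return IoU, recall, precision, and F-measure for two binary masks."""
    tp, fp, fn = confusion_counts(pred, truth)
    union = tp + fp + fn
    iou = tp / union if union else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {"iou": iou, "recall": recall,
            "precision": precision, "f_measure": f_measure}


def relative_count_error(n_pred, n_true):
    """Absolute difference in tesserae count, divided by the true count.

    The division by n_true is an assumption of this sketch.
    """
    return abs(n_pred - n_true) / n_true
```

Under this assumed definition, predicting 98 tesserae where 100 exist gives a count error of 0.02.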

Paper Structure

This paper contains 23 sections, 15 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Example of an annotated image: (a) original image, (b) binary mask.
  • Figure 2: Augmentations applied to a single example image (cropped).
  • Figure 3: (Left) A cropped, augmented training image. (Right) The corresponding raw mask logits, showing model confidence per pixel regarding tesserae classification.
  • Figure 4: Example comparison of the baseline SAM2_base and our fine-tuned SAM2_ft model on the example image of Fig. 1. White pixels are correctly segmented by both models, green are correctly segmented only by SAM2_ft, red only by SAM2_base, while blue pixels are missed by both. Black pixels represent the background.
  • Figure 5: Segmentation of the Museum image with our method.
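
The color coding described in the Figure 4 caption can be sketched as a per-pixel lookup over three binary masks: the ground truth, the baseline prediction, and the fine-tuned prediction. This is only an illustration of the caption, not the authors' plotting code. The function name `comparison_map`, the nested-list masks, and the exact RGB values are assumptions, as is the choice to leave false positives on background pixels black, which the caption does not address.

```python
# RGB colors following the Figure 4 caption (exact values are assumed).
WHITE = (255, 255, 255)  # tessera pixel segmented by both models
GREEN = (0, 255, 0)      # segmented only by the fine-tuned model
RED = (255, 0, 0)        # segmented only by the baseline model
BLUE = (0, 0, 255)       # tessera pixel missed by both models
BLACK = (0, 0, 0)        # background in the ground-truth mask


def comparison_map(truth, base, ft):
    """Return a nested list of RGB tuples, one per pixel.

    truth, base, and ft are nested lists of 0/1 values with the same shape.
    """
    out = []
    for truth_row, base_row, ft_row in zip(truth, base, ft):
        row = []
        for t, b, f in zip(truth_row, base_row, ft_row):
            if not t:
                # Assumption: predictions on background are not highlighted.
                row.append(BLACK)
            elif b and f:
                row.append(WHITE)
            elif f:
                row.append(GREEN)
            elif b:
                row.append(RED)
            else:
                row.append(BLUE)
        out.append(row)
    return out
```

In such a map, a larger green area than red area would indicate that the fine-tuned model recovers more tesserae pixels than the baseline, in line with the reported Recall gain.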