ArcAid: Analysis of Archaeological Artifacts using Drawings

Offry Hayon, Stefan Münger, Ilan Shimshoni, Ayellet Tal

TL;DR

This work addresses the challenge of analyzing damaged archaeological artifacts with scarce labeled data by introducing a semi-supervised framework that leverages paired expert drawings during training to shape image embeddings. The model aligns image and drawing representations, supports multi-task learning by also generating drawings from images, and demonstrates strong gains in shape and period classification and retrieval across multiple backbones. A new dataset, CSSL, of paired images and drawings of stamp-seals, is introduced and released to the community, enabling robust evaluation of cross-modal archaeology methods. The approach shows the value of domain-specific sketches for improving perception in degraded visual data and offers a pathway to automatic documentation via image-to-drawing generation, with broader applicability to reliefs and similar artifacts.

Abstract

Archaeology is an intriguing domain for computer vision. It suffers not only from a shortage of (labeled) data, but also from highly challenging data, which is often extremely abraded and damaged. This paper proposes a novel semi-supervised model for classification and retrieval of images of archaeological artifacts. This model utilizes unique data that exists in the domain -- manual drawings made by specialist artists. These are used during training to implicitly transfer the domain knowledge from the drawings to their corresponding images, improving their classification results. We show that while learning how to classify, our model also learns how to generate drawings of the artifacts, an important documentation task, which is currently performed manually. Last but not least, we collected a new dataset of stamp-seals of the Southern Levant. Our code and dataset are publicly available.

Paper Structure (7 sections, 3 equations, 8 figures, 8 tables)

This paper contains 7 sections, 3 equations, 8 figures, 8 tables.

Figures (8)

  • Figure 1: Input to training. The edges of the drawings are clear and complete, unlike their counterparts in the images. Moreover, the details differ and the pairs are misaligned.
  • Figure 2: Training with and without drawings. When training only using images (top), the model focuses on the torso of the ibex, which leads to misclassification as a lion. Conversely, thanks to the drawing, our model focuses on the horns and the head (bottom), and classifies the image correctly.
  • Figure 3: Model. This figure illustrates the processing of a batch of image-drawing pairs ($3$ in this example), some of which are labeled and some unlabeled. The image and its corresponding drawing are encoded, where $\theta_{Enc_{Im}}$ and $\theta_{Enc_{Draw}}$ represent the parameters of their encoders, respectively. The FC components represent the classifiers. The image decoder, whose parameters are $\theta_{Dec_{Im}}$, generates the reconstructed drawing. The loss $\mathcal{L}$ consists of three components: $\mathcal{L}_{CE}$ for classification, $\mathcal{L}_{Gen}$ for image-to-drawing generation, and $\mathcal{L}_{Sim}$, whose goal is to maximize the similarity between pair embeddings. For an unlabeled pair, the classification component is ignored; thus, we freeze $\theta_{Enc_{Draw}}$ and update the other components. For a labeled pair, $\theta_{Enc_{Draw}}$ is updated due to $\mathcal{L}_{CE}$; $\theta_{Enc_{Im}}$ is updated due to all the components of the loss function; and $\theta_{Dec_{Im}}$ is updated via $\mathcal{L}_{Gen}$.
  • Figure 4: Shape classes. (a) shows a single instance of an image-drawing pair for each of the $10$ classes. (b) is the number of labeled pairs.
  • Figure 5: Sub-period classes. All the findings are decorated by lions. They differ in shape and are dated to different periods (a) and sub-periods (b). The bottom row (c) shows the number of labeled pairs in each sub-period (not just for lions).
  • ...and 3 more figures
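
The three-part loss and the update rule described in the Figure 3 caption can be illustrated with a minimal sketch. This is not the authors' implementation, and it makes several assumptions the caption does not state: $\mathcal{L}_{Sim}$ is taken as one minus cosine similarity, $\mathcal{L}_{Gen}$ as mean squared error, and the three terms are summed with equal weights. The function names (`pair_loss`, `updated_parameters`) are hypothetical. Only the masking of $\mathcal{L}_{CE}$ for unlabeled pairs and the list of updated parameter groups follow the caption directly.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example (numerically stabilised)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def pair_loss(img_emb, draw_emb, gen_drawing, true_drawing,
              img_logits=None, draw_logits=None, label=None):
    """Loss for one image-drawing pair; label=None marks an unlabeled pair.

    ASSUMPTIONS (not from the source): cosine-based L_Sim, MSE-based L_Gen,
    equal weighting of the three terms.
    """
    parts = {
        # L_Sim: small when the image and drawing embeddings are similar.
        "sim": 1.0 - cosine_similarity(img_emb, draw_emb),
        # L_Gen: distance between the generated and the expert drawing.
        "gen": mse(gen_drawing, true_drawing),
        # L_CE: ignored for unlabeled pairs, as stated in the caption.
        "ce": 0.0,
    }
    if label is not None:
        parts["ce"] = (cross_entropy(img_logits, label)
                       + cross_entropy(draw_logits, label))
    return sum(parts.values()), parts


def updated_parameters(labeled):
    """Parameter groups that receive updates, following the Figure 3 caption."""
    groups = ["theta_Enc_Im", "theta_Dec_Im"]
    if labeled:
        # The drawing encoder is updated only through L_CE, so it stays
        # frozen whenever the pair has no label.
        groups.append("theta_Enc_Draw")
    return groups


# Usage: an unlabeled pair contributes only the similarity and generation terms.
total, parts = pair_loss([0.6, 0.8], [0.8, 0.6], [0.2, 0.4], [0.0, 0.5])
print(parts, updated_parameters(labeled=False))
```

In an autograd framework, the caption's rule that $\theta_{Enc_{Draw}}$ is updated only via $\mathcal{L}_{CE}$ would presumably require treating the drawing embedding as a constant inside the similarity term; the sketch above has no gradients, so it only records which groups would be updated.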