ARTeFACT: Benchmarking Segmentation Models on Diverse Analogue Media Damage
Daniela Ivanova, Marco Aversa, Paul Henderson, John Williamson
TL;DR
This paper tackles robust damage detection in analogue media for cultural heritage preservation by introducing ARTeFACT, a diverse dataset with pixel-accurate damage masks for 15 damage types across 10 materials and 4 content categories, plus textual descriptions. It performs an extensive benchmark of zero-shot, supervised, unsupervised, and text-guided segmentation methods (including SAM, SegFormer, UPerNet variants, DINOv2, and diffusion-based approaches) to evaluate cross-media generalization. The results reveal substantial generalization gaps: no method consistently detects damage across all media types, with SAM requiring impractical prompt engineering, supervised models struggling with multiclass tasks, and diffusion-based methods offering only limited precision. The work provides a first-of-its-kind, publicly available benchmark and taxonomy for damaged analogue media, highlighting the need for new, robust damage-detection pipelines in conservation practice.
Abstract
Accurately detecting and classifying damage in analogue media such as paintings, photographs, textiles, mosaics, and frescoes is essential for cultural heritage preservation. While machine learning models excel at correcting degradation if the damage operator is known a priori, we show that they fail to robustly predict where the damage is even after supervised training; thus, reliable damage detection remains a challenge. Motivated by this, we introduce ARTeFACT, a dataset for damage detection in diverse types of analogue media, with over 11,000 annotations covering 15 kinds of damage across various subjects, media, and historical provenance. Furthermore, we contribute human-verified text prompts describing the semantic contents of the images, and derive additional textual descriptions of the annotated damage. We evaluate CNN, Transformer, and diffusion-based segmentation models, as well as foundation vision models, in zero-shot, supervised, unsupervised, and text-guided settings, revealing their limitations in generalising across media types. Our dataset is available at https://daniela997.github.io/ARTeFACT/ as the first-of-its-kind benchmark for analogue media damage detection and restoration.
