Historical Printed Ornaments: Dataset and Tasks

Sayan Kumar Chaki, Zeynep Sonat Baltaci, Elliot Vincent, Remi Emonet, Fabienne Vial-Bonacci, Christelle Bahier-Porte, Mathieu Aubry, Thierry Fournel

TL;DR

This paper applies unsupervised computer vision to historical printed ornaments. It introduces the Rey's Ornaments dataset, built from books published by or attributed to Marc-Michel Rey, and uses it to study three tasks: clustering, element discovery, and unsupervised change localization. For each task it provides a benchmark and evaluates adapted state-of-the-art methods; for element discovery it also includes a synthetic pretraining pipeline. The key finding is that simple baselines such as k-means in pixel space can outperform more sophisticated models on this data, while many unsupervised methods struggle with the variability of real historical prints and with tight annotations. The work highlights the need for task-specific definitions of change and for robust modeling of background and ink variations, and it releases the dataset and code to support further research in document and historical object analysis.

Abstract

This paper aims to develop the study of historical printed ornaments with modern unsupervised computer vision. We highlight three complex tasks that are of critical interest to book historians: clustering, element discovery, and unsupervised change localization. For each of these tasks, we introduce an evaluation benchmark, and we adapt and evaluate state-of-the-art models. Our Rey's Ornaments dataset is designed to be a representative example of a set of ornaments historians would be interested in. It focuses on an XVIIIth century bookseller, Marc-Michel Rey, providing a consistent set of ornaments with a wide diversity and representative challenges. Our results highlight the limitations of state-of-the-art models when faced with real data and show simple baselines such as k-means or congealing can outperform more sophisticated approaches on such data. Our dataset and code can be found at https://printed-ornaments.github.io/.
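
To make the "simple baseline" concrete, here is a minimal sketch of k-means clustering applied directly to flattened pixel intensities. It is an illustration under stated assumptions, not the authors' implementation: the function name `kmeans_pixels`, the random initialisation, and the synthetic toy images are all hypothetical, and the images are assumed to be grayscale and already resized to a common shape.

```python
import numpy as np


def kmeans_pixels(images, k, n_iters=50, seed=0):
    """Cluster same-sized images with Lloyd's k-means on raw pixel values.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    rng = np.random.default_rng(seed)
    # Flatten each image into one feature vector of pixel intensities.
    X = images.reshape(len(images), -1).astype(np.float64)
    # Initialise the centroids with k distinct images (fancy indexing copies).
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=np.int64)
    for it in range(n_iters):
        # Squared Euclidean distance from every image to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        # Stop once the assignment no longer changes.
        if it > 0 and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels, centroids


# Toy data standing in for ornament crops: ten dark and ten light 16x16 images.
rng = np.random.default_rng(42)
dark = 0.1 + 0.01 * rng.standard_normal((10, 16, 16))
light = 0.9 + 0.01 * rng.standard_normal((10, 16, 16))
images = np.concatenate([dark, light])

labels, centroids = kmeans_pixels(images, k=2)
```

Each centroid has the same dimensionality as a flattened image, so it can be reshaped and viewed as a cluster prototype. Real ornaments would additionally need consistent cropping and resizing before pixel distances are meaningful.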

Paper Structure (26 sections, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Our Rey's Ornaments dataset. Our dataset, based on ornaments found in the books published by or attributed to Marc-Michel Rey (1720-1780), focuses on three unsupervised computer vision tasks that are of interest to book historians: (a) image clustering of ornaments printed using wooden blocks, (b) unsupervised element discovery in composite ornaments printed using multiple types of vignettes, and (c) unsupervised change localization in vignette series.
  • Figure 2: Examples of composite ornaments from our synthetic dataset.
  • Figure 3: Qualitative results for clustering. Although clusters obtained with DTI Clustering are often valid (a), there are failure cases (b) due to e.g. similar vignettes (top), or split clusters (bottom). Results are qualitatively similar when using k-means with pre-trained feature extractors.
  • Figure 4: Qualitative results for element discovery. We show the dataset images (a) with their semantic ground truth bounding boxes (b) and the reconstruction and predicted semantic bounding boxes from different models (c-g). For all the methods, we show the results of the models pre-trained on the synthetic dataset and fine-tuned on the real dataset, and the semantic boxes obtained using k-means on CLIP features.
  • Figure 5: Qualitative results for change localization. For randomly selected vignettes, we show (a) an example normal vignette as well as (b) the changed vignette with (c) the corresponding ground truth change mask (GT). For each method (d-g), we show the predicted difference image.
  • ...and 1 more figure