Historical Astronomical Diagrams Decomposition in Geometric Primitives

Syrine Kalleli, Scott Trigg, Ségolène Albouy, Mathieu Husson, Mathieu Aubry

TL;DR

This work introduces a unique dataset of 303 astronomical diagrams from diverse traditions, ranging from the XIIth to the XVIIIth century, annotated with more than 3000 line segments, circles and arcs, and develops a model that builds on DINO-DETR to enable the prediction of multiple geometric primitives.

Abstract

Automatically extracting the geometric content from the hundreds of thousands of diagrams drawn in historical manuscripts would enable historians to study the diffusion of astronomical knowledge on a global scale. However, state-of-the-art vectorization methods, often designed for modern data, are poorly suited to the complexity and diversity of historical astronomical diagrams. Our contribution is thus twofold. First, we introduce a unique dataset of 303 astronomical diagrams from diverse traditions, ranging from the XIIth to the XVIIIth century, annotated with more than 3000 line segments, circles and arcs. Second, we develop a model that builds on DINO-DETR to enable the prediction of multiple geometric primitives. We show that it can be trained solely on synthetic data and still accurately predict primitives on our challenging dataset. Our approach substantially improves over the LETR baseline, which is restricted to lines, by introducing a meaningful parametrization for multiple primitives, jointly training for detection and parameter refinement, using deformable attention, and training on rich synthetic data. Our dataset and code are available on our webpage.

Paper Structure

This paper contains 30 sections, 5 equations, 11 figures, and 1 table.

Figures (11)

  • Figure 1: Goal. We perform historical astronomical diagram vectorization by predicting simple geometric primitives, such as lines, circles, and arcs, through a transformer encoder-decoder model. Our modified decoder queries, which we refer to as primitive queries, are associated with different geometric primitives.
  • Figure 2: Model architecture. Given an input image, the backbone extracts multi-scale features which are fed to the Transformer encoder along with a positional encoding. The primitive queries, composed of content (filled) and modified positional (empty) queries, go through the Transformer decoder where they probe the enhanced encoder features through deformable cross-attention. Queries are refined layer-by-layer in the decoder, to finally predict the primitive class, bounding box and parameters.
  • Figure 3: Modified Positional Queries. All coordinates are normalized with respect to the image size. The bounding box is defined by its center and size. Lines are defined by endpoints, circles by center and radius (normalized by image width, $r_x$, and by image height, $r_y$), and arcs by endpoints and midpoint.
  • Figure 4: Primitive refinement. Each modified decoder block (left) updates the positional query $(b^l, g^l)$, which progressively refines predictions (right).
  • Figure 5: Dataset characteristics. Our diagrams have varying complexity and several challenges are introduced by document deterioration, heavy presence of text, and overlapping primitives.
  • ...and 6 more figures
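
The parametrization described in the Figure 3 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the flat-list layout, and the ordering of arc points (start, end, midpoint) are assumptions made for illustration. Only the choice of parameters (endpoints for lines, center plus $r_x$ and $r_y$ for circles, endpoints and midpoint for arcs, and a center-and-size bounding box) comes from the caption.

```python
# Hypothetical helpers illustrating the normalized primitive parameters.
# All names and the parameter ordering are assumptions, not the paper's code.

def normalize_line(p1, p2, width, height):
    """Line segment: two endpoints, each coordinate divided by the image size."""
    (x1, y1), (x2, y2) = p1, p2
    return [x1 / width, y1 / height, x2 / width, y2 / height]


def normalize_circle(center, radius, width, height):
    """Circle: normalized center, plus the radius normalized by width (r_x)
    and by height (r_y)."""
    cx, cy = center
    return [cx / width, cy / height, radius / width, radius / height]


def normalize_arc(start, mid, end, width, height):
    """Arc: two endpoints followed by a midpoint lying on the arc
    (this ordering is an assumption)."""
    return [v for (x, y) in (start, end, mid) for v in (x / width, y / height)]


def line_bbox(params):
    """Bounding box of a normalized line as (center x, center y, width, height)."""
    x1, y1, x2, y2 = params
    return [(x1 + x2) / 2, (y1 + y2) / 2, abs(x2 - x1), abs(y2 - y1)]


def circle_bbox(params):
    """Bounding box of a normalized circle as (center x, center y, width, height)."""
    cx, cy, rx, ry = params
    return [cx, cy, 2 * rx, 2 * ry]


# Example on a 400 x 200 pixel image.
circle = normalize_circle((200, 100), 50, 400, 200)
line = normalize_line((0, 0), (400, 200), 400, 200)
arc = normalize_arc((100, 100), (200, 50), (300, 100), 400, 200)
```

Normalizing the radius separately by width and height keeps every parameter in the unit interval even for non-square images, which is why a single pixel radius yields two different values here.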