ReassembleNet: Learnable Keypoints and Diffusion for 2D Fresco Reconstruction

Adeela Islam, Stefano Fiorini, Stuart James, Pietro Morerio, Alessio Del Bue

TL;DR

ReassembleNet tackles the challenging problem of 2D fresco reassembly by representing irregular fragments with learnable contour keypoints and enriching them with multimodal geometric and texture features. The method couples a learnable keypoint selector, graph-based attention, and a diffusion-based pose estimator to iteratively refine piece translations and rotations, supported by pretraining on a semi-synthetic dataset to bridge sim-to-real gaps. On the RePAIR benchmark, it achieves substantial improvements in RMSE for rotation and translation compared to prior methods, and demonstrates scalability in memory usage and keypoint configurations. This work advances practical, data-efficient reassembly for real-world artifacts, enabling more reliable automatic reconstruction in archaeology and related fields.

Abstract

The task of reassembly is a significant challenge across multiple domains, including archaeology, genomics, and molecular docking, requiring the precise placement and orientation of elements to reconstruct an original structure. In this work, we address key limitations in state-of-the-art Deep Learning methods for reassembly, namely i) scalability; ii) multimodality; and iii) real-world applicability, i.e., going beyond square or simple geometric shapes to handle realistic and complex erosion and other real-world problems. We propose ReassembleNet, a method that reduces complexity by representing each input piece as a set of contour keypoints and learning to select the most informative ones using techniques inspired by Graph Neural Network pooling. ReassembleNet effectively lowers computational complexity while enabling the integration of features from multiple modalities, including both geometric and texture data. The model is further enhanced through pretraining on a semi-synthetic dataset. We then apply diffusion-based pose estimation to recover the original structure. We improve on prior methods by 57% and 87% for RMSE Rotation and Translation, respectively.
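
The keypoint selection step mentioned in the abstract can be illustrated with a minimal sketch of score-based top-$k$ pooling. This is not the authors' implementation: the scoring rule below (a single projection vector `p`, with kept features gated by `tanh` of their score) is an assumption borrowed from generic graph top-$k$ pooling and stands in for whatever scorer the paper actually learns, and the names `select_top_k_keypoints`, `features`, `coords`, and `p` are hypothetical.

```python
import numpy as np

def select_top_k_keypoints(features, coords, p, k):
    """Score every keypoint, keep the k highest-scoring ones.

    features: (N, D) per-keypoint feature vectors (hypothetical input)
    coords:   (N, 2) keypoint coordinates on the piece contour
    p:        (D,) projection vector; learned in practice, random here
    """
    # One scalar score per keypoint: projection onto the unit vector p/||p||.
    scores = features @ p / np.linalg.norm(p)
    # Indices of the k largest scores, highest first.
    keep = np.argsort(-scores, kind="stable")[:k]
    # Gate the kept features by their score so the scorer receives gradients
    # in a differentiable framework (assumed design, as in top-k pooling).
    gated = features[keep] * np.tanh(scores[keep])[:, None]
    return coords[keep], gated, keep

rng = np.random.default_rng(0)
features = rng.normal(size=(50, 16))   # 50 contour keypoints, 16-dim features
coords = rng.uniform(size=(50, 2))
p = rng.normal(size=16)

kept_coords, kept_feats, idx = select_top_k_keypoints(features, coords, p, k=8)
print(kept_coords.shape, kept_feats.shape)
```

Reducing each piece from all contour keypoints to a fixed $k$ is what lets the downstream attention operate on a bounded number of tokens per piece, which is consistent with the scalability claim in the abstract.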

Paper Structure

This paper contains 29 sections, 9 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: We introduce ReassembleNet as a method for fresco reassembly. ReassembleNet addresses key challenges that traditional methods have ignored, including complex geometry, texture and erosion, and missing pieces, often in a data-scarce scenario.
  • Figure 2: Framework of our proposed ReassembleNet. We begin by extracting keypoints from the input pieces, followed by computing global and local texture features alongside geometric features. Using the geometric features and keypoint coordinates, we then select the $k$ most relevant keypoints. To model the reassembly process, we employ a Diffusion Probabilistic Model, formulating a Markov chain that gradually injects noise into the keypoints’ positions and orientations. At timestep $t = 0$, the pieces are correctly aligned, whereas at timestep $t = T$, their keypoints are randomly translated and rotated (note that, for visualization purposes, we compute the average translation and rotation of keypoints within each piece at every step in the chain). At each timestep $t$, our attention module processes the keypoints, incorporating their coordinates, orientations, and extracted features, to predict a less noisy version of their positions and orientations, $\{\hat{X}_{t-1}^m\}_{m=1}^M$, iteratively refining them toward the correct configuration.
  • Figure 3: An illustration of the Learnable KeyPoint Selector Module: keypoints are projected into a high-dimensional space, a graph transformer predicts a score for each keypoint, and the scores are then used to identify and pool the top-$k$, i.e., the most important, keypoints.
  • Figure 4: Qualitative results on RePAIR, showing the reassembly outcomes on four frescoes.
  • Figure 5: GPU memory consumption as a function of the number of puzzle pieces on the RePAIR dataset.
  • ...and 3 more figures
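
The forward noising chain described in the Figure 2 caption can be sketched with the standard closed-form DDPM sample $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1 - \bar{\alpha}_t}\,\epsilon$, with $\epsilon \sim \mathcal{N}(0, I)$. The sketch below rests on several assumptions that the text above does not confirm: a linear $\beta$ schedule, $T = 1000$ steps, and a pose encoded as $(x, y, \cos\theta, \sin\theta)$. The function names `make_alpha_bar` and `noise_poses` are hypothetical.

```python
import numpy as np

def make_alpha_bar(T, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear schedule."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def noise_poses(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) for a 1-based timestep t.

    x0: (N, 4) clean poses as (x, y, cos(theta), sin(theta)) -- assumed encoding.
    Returns the noisy poses and the noise that was added.
    """
    eps = rng.standard_normal(x0.shape)
    a = alpha_bar[t - 1]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps, eps

rng = np.random.default_rng(0)
theta = rng.uniform(-np.pi, np.pi, size=5)
x0 = np.stack(
    [rng.uniform(-1, 1, 5), rng.uniform(-1, 1, 5), np.cos(theta), np.sin(theta)],
    axis=1,
)

alpha_bar = make_alpha_bar(T=1000)
x_early, _ = noise_poses(x0, 1, alpha_bar, rng)     # almost the aligned poses
x_late, _ = noise_poses(x0, 1000, alpha_bar, rng)   # close to pure noise
```

At $t = 1$ the sample stays close to the aligned configuration, while at $t = T$ the signal term is almost entirely suppressed, matching the caption's description of correctly aligned pieces at $t = 0$ and randomly translated and rotated keypoints at $t = T$. A denoising network would be trained to reverse this chain one step at a time.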