A Generic Hybrid Framework for 2D Visual Reconstruction

Daniel Rika, Dror Sholomon, Eli David, Alexandre Pais, Nathan S. Netanyahu

TL;DR

The paper tackles robust 2D visual reconstruction from fragmented square pieces by formulating it as a jigsaw puzzle problem (JPP) with $N$ pieces of size $P \times P$ and introducing a generic hybrid pipeline. It combines a deep learning (DL)-based compatibility measure (DLCM), which analyzes the entire content of each piece, with an enhanced genetic algorithm (GA)-based solver that optimizes global placements, addressing real-world challenges such as degraded boundaries and unknown puzzle dimensions. The approach achieves state-of-the-art performance on large Type-1 and Type-2 puzzles, including Portuguese tile panels and eroded-boundary scenarios, and demonstrates strong generalization to synthetic JPPs and shredded documents. Practical implications include a scalable framework for archaeology, art restoration, and forensic reconstruction, with plans to reduce computational bottlenecks via embeddings and multi-threading.

Abstract

This paper presents a versatile hybrid framework for addressing 2D real-world reconstruction tasks formulated as jigsaw puzzle problems (JPPs) with square, non-overlapping pieces. Our approach integrates a deep learning (DL)-based compatibility measure (CM) model that evaluates pairs of puzzle pieces holistically, rather than focusing solely on their adjacent edges as traditionally done. This DL-based CM is paired with an optimized genetic algorithm (GA)-based solver, which iteratively searches for a global optimal arrangement using the pairwise CM scores of the puzzle pieces. Extensive experimental results highlight the framework's adaptability and robustness across multiple real-world domains. Notably, our unique hybrid methodology achieves state-of-the-art (SOTA) results in reconstructing Portuguese tile panels and large degraded puzzles with eroded boundaries.

Paper Structure

This paper contains 27 sections, 5 equations, 13 figures, 6 tables.

Figures (13)

  • Figure 1: Reconstruction of 460-tile Portuguese panel with unknown piece orientation and panel dimensions using our proposed system: (a) Scrambled image of 460-piece panel, and (b) perfectly reconstructed panel achieved through our deep-learning compatibility measure (DLCM) and GA-based solver.
  • Figure 2: Artificially eroded boundaries of puzzle with 150 $64 \times 64$ square tiles: (a) Original image ($t=0$), (b) image with 2-pixel erosion layers ($t=2$), and (c) image with 4-pixel erosion layers ($t=4$).
  • Figure 3: Piece augmentation through degradation and shift: (a) Tile degraded by removing a 2-pixel frame, (b) tile shifted one pixel to the left and one pixel upward, and (c) combined augmentation of (a) and (b).
  • Figure 4: Sub-model architecture with input size $P \times 2P \times C$, where the piece size is $P \times P$ pixels and $C$ is the number of channels; the architecture includes four convolutional layers with $3 \times 3$ kernels and ReLU activation function; max pooling is applied after the second and third layers, and dropout with a probability of 0.25 is applied after all layers except the first; the final convolutional layer is flattened and passed through a fully-connected layer to compute the compatibility score without an activation function; no biases are used in any layer.
  • Figure 5: Our DLCM architecture, consisting of four sub-models: RGB-Net, Red-Net, Green-Net, and Blue-Net, with the same architecture as Figure 4; the DLCM output is the sum of the outputs of all four sub-models.
  • ...and 8 more figures
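
The captions of Figures 2 to 5 describe boundary erosion, the shift augmentation, the $P \times 2P \times C$ pair input, and the summation of four sub-model outputs. The following is a minimal NumPy sketch of those steps, not the authors' implementation. All function names are hypothetical. It assumes that removed or vacated pixels are set to zero, that the pair input is formed by placing the two pieces side by side, and that each single-channel sub-model receives its corresponding color channel; the captions do not state these details. The sub-models themselves are passed in as stand-in callables.

```python
import numpy as np


def erode_frame(tile, t):
    """Blank the outer t-pixel frame of a tile.

    Assumption: "removed" pixels are set to 0 (the captions do not say).
    """
    h, w = tile.shape[:2]
    out = np.zeros_like(tile)
    if 2 * t < min(h, w):
        out[t:h - t, t:w - t] = tile[t:h - t, t:w - t]
    return out


def shift_tile(tile, dx, dy):
    """Shift tile content by dx columns and dy rows.

    Negative dx shifts left, negative dy shifts up.
    Assumption: vacated pixels are zero-filled.
    """
    h, w = tile.shape[:2]
    out = np.zeros_like(tile)
    src_rows = slice(max(0, -dy), min(h, h - dy))
    dst_rows = slice(max(0, dy), min(h, h + dy))
    src_cols = slice(max(0, -dx), min(w, w - dx))
    dst_cols = slice(max(0, dx), min(w, w + dx))
    out[dst_rows, dst_cols] = tile[src_rows, src_cols]
    return out


def make_pair_input(left, right):
    """Place two P x P x C pieces side by side to get a P x 2P x C array."""
    return np.concatenate([left, right], axis=1)


def dlcm_score(pair, rgb_net, red_net, green_net, blue_net):
    """Sum the scores of four sub-models, as in the Figure 5 caption.

    Each *_net argument is a hypothetical callable returning a scalar score.
    Assumption: the single-channel nets see channels 0, 1, 2 (R, G, B).
    """
    return (
        rgb_net(pair)
        + red_net(pair[..., 0:1])
        + green_net(pair[..., 1:2])
        + blue_net(pair[..., 2:3])
    )
```

Under these assumptions, the combined augmentation of Figure 3(c) corresponds to `shift_tile(erode_frame(tile, 2), dx=-1, dy=-1)`, and a candidate pair is scored with `dlcm_score(make_pair_input(left, right), ...)` once trained sub-models are supplied in place of the stand-ins.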