One-to-many Reconstruction of 3D Geometry of cultural Artifacts using a synthetically trained Generative Model

Thomas Pöllabauer, Julius Kühn, Jiayi Li, Arjan Kuijper

TL;DR

The paper tackles reconstructing 3D geometry of cultural artifacts from a single, low-quality sketch, addressing data scarcity in heritage domains. It presents a fully automated pipeline that uses synthetic data and diffusion-based generation (PITI) to produce depth maps, normal maps, and complete 3D geometry, with optional multi-view outputs. Key contributions include a literature survey focusing on one-to-many 3D reconstruction in cultural heritage, demonstration of diffusion and GAN-based approaches for this domain, and an interactive authoring tool for experts. The approach enables interactive exploration and reconstruction of lost artifacts from minimal input, with results validating viability on sinopia-like sketches and highlighting future work on multi-source inputs and user studies.

Abstract

Estimating the 3D shape of an object from a single image is a difficult problem. Modern approaches achieve good results for general objects when working from real photographs, but perform worse on less expressive representations such as historic sketches. Our automated approach generates a variety of detailed 3D representations from a single sketch depicting a medieval statue, and can be guided by multi-modal inputs such as text prompts. It relies solely on synthetic data for training, making it applicable even when only a small number of training examples is available. Our solution allows domain experts such as curators to interactively reconstruct potential appearances of lost artifacts.

Paper Structure (10 sections, 3 figures)

This paper contains 10 sections, 3 figures.

Figures (3)

  • Figure 1: The proposed approach consists of the following components: (a) Input data, the original sinopia images. (b) Input image pre-processing with OpenCV. (c) Image inpainting to restore the lines and repair holes. (d) Inpainted images serve as input for the diffusion model. (e) We use the PITI diffusion approach for further generation. (f) Depth maps and normals are obtained from the diffusion outputs. (g) An optional generation network is proposed to create additional views for reconstructing multi-perspective 3D models. (h) Point clouds and 3D shapes are generated from the depth and normal maps, and can be used for full geometry reconstruction.
  • Figure 2: In- and output image pairs before and after the automated image pre-processing.
  • Figure 3: Preliminary 3D reconstruction results.
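
Step (h) of Figure 1 turns a depth map into a point cloud. The source does not say which camera model or library the authors use, so the following is only a minimal, dependency-free sketch of one common way to do this: back-projection through a pinhole camera. The function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical and not taken from the paper.

```python
def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map into a list of (x, y, z) points.

    Assumption (not from the paper): a pinhole camera with focal
    lengths fx, fy and principal point (cx, cy), all in pixels.
    `depth` is a list of rows; entries <= 0 are treated as missing.
    """
    points = []
    for v, row in enumerate(depth):        # v: pixel row index
        for u, z in enumerate(row):        # u: pixel column index
            if z <= 0:
                continue                   # skip pixels without depth
            x = (u - cx) * z / fx          # horizontal offset scaled by depth
            y = (v - cy) * z / fy          # vertical offset scaled by depth
            points.append((x, y, float(z)))
    return points


if __name__ == "__main__":
    # Toy 2x2 depth map with one missing pixel.
    toy_depth = [[1.0, 2.0],
                 [0.0, 4.0]]
    for point in depth_to_point_cloud(toy_depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5):
        print(point)
```

In a real pipeline the depth map would come from the diffusion output in step (f), and the normal map could be attached to each point as an extra attribute before surface reconstruction.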