Gaussian Heritage: 3D Digitization of Cultural Heritage with Integrated Object Segmentation
Mahtab Dahaghin, Myrna Castillo, Kourosh Riahidehkordi, Matteo Toso, Alessio Del Bue
TL;DR
This work tackles affordable 3D digitization of cultural heritage objects using only RGB imagery from consumer devices. It introduces a pipeline that combines 3D Gaussian Splatting with open-vocabulary segmentation via Grounding DINO and SAM. Geometry, appearance, and per-object segmentation are learned jointly through the losses $\mathcal{L}_{rendering}$, $\lambda_{clustering}\mathcal{L}_{CC}$, and $\mathcal{L}_{reg}$, with each Gaussian carrying a segmentation feature in $\mathbb{R}^{16}$. The method enables instance-aware 3D reconstruction and automated object extraction, including a convex-hull refinement step ($gaussian\_grouping$) and multi-view rendering, and is released as a Dockerized pipeline. Evaluation on public datasets shows quantitative improvements in segmentation metrics (e.g., $mIoU$ and $mBIoU$), and qualitative results demonstrate accurate object-level 3D models, highlighting the method's practical potential for museum-like deployments.
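For readers unfamiliar with the mIoU metric mentioned above, the following is a minimal sketch of the standard definition (intersection over union of binary masks, averaged over mask pairs). It is not taken from the released code: the function names `iou` and `mean_iou`, the list-of-lists mask format, and the convention that two empty masks score 1.0 are assumptions made for illustration. The boundary variant mBIoU is not covered here.

```python
def iou(pred, gt):
    """Intersection over union of two binary masks given as 2D lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over a non-empty list of (predicted, ground-truth) mask pairs."""
    if not pairs:
        raise ValueError("mean_iou needs at least one mask pair")
    scores = [iou(pred, gt) for pred, gt in pairs]
    return sum(scores) / len(scores)


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are covered by either mask.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
gt_mask = [[1, 0, 0],
           [0, 1, 1]]

print(iou(pred_mask, gt_mask))                     # IoU of the toy pair
print(mean_iou([(pred_mask, gt_mask), (gt_mask, gt_mask)]))
```

In practice the masks would be rendered per-object segmentations compared against ground-truth annotations, usually as arrays rather than nested lists.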
Abstract
The creation of digital replicas of physical objects has valuable applications for the preservation and dissemination of tangible cultural heritage. However, existing methods are often slow and expensive, and they require expert knowledge. We propose a pipeline to generate a 3D replica of a scene using only RGB images (e.g., photos of a museum) and then extract a model for each item of interest (e.g., pieces in the exhibit). We do this by leveraging recent advances in novel view synthesis and Gaussian Splatting, modified to enable efficient 3D segmentation. This approach does not need manual annotation, and the visual inputs can be captured using a standard smartphone, making it both affordable and easy to deploy. We provide an overview of the method and a baseline evaluation of the accuracy of object segmentation. The code is available at https://mahtaabdn.github.io/gaussian_heritage.github.io/.
