Scalable Cloud-Native Pipeline for Efficient 3D Model Reconstruction from Monocular Smartphone Images
Potito Aghilar, Vito Walter Anelli, Michelantonio Trizio, Tommaso Di Noia
TL;DR
The paper addresses scalable 3D reconstruction from monocular smartphone images by proposing a cloud-native, microservices-based pipeline tailored for Digital Twins in Industry 4.0. It integrates an ARCore-based pose recorder for data collection, CarveKit for generating alpha masks, and NVIDIA nvdiffrec for mesh and texture reconstruction, all orchestrated in a Kubernetes environment with MinIO storage. Key contributions include a custom quaternion-based pose compensation algorithm, a dataset preprocessing and conditioning flow, and a modular cloud architecture that separates data preprocessing, reconstruction, and scheduling tasks. The results demonstrate end-to-end feasibility, reporting the latency of individual pipeline components alongside reconstruction quality metrics, and highlight the practical potential for rapid, scalable, phone-to-cloud 3D model generation and deployment in industrial training and visualization contexts.
Abstract
In recent years, 3D models have gained popularity in various fields, including entertainment, manufacturing, and simulation. However, creating these models manually is time-consuming and resource-intensive, making it impractical for large-scale industrial applications. To address this issue, researchers are exploiting Artificial Intelligence and Machine Learning algorithms to generate 3D models automatically. In this paper, we present a novel cloud-native pipeline that automatically reconstructs 3D models from monocular 2D images captured with a smartphone camera. Our goal is to provide an efficient and easily adoptable solution that meets Industry 4.0 standards for creating a Digital Twin model, which could enhance personnel expertise through accelerated training. We leverage machine learning models developed by NVIDIA Research Labs alongside a custom-designed pose recorder, built on the ARCore framework by Google, with a unique pose compensation component. Our solution produces a reusable 3D model, with embedded materials and textures, that can be exported to and customized in any external 3D modelling software or 3D engine. Furthermore, the whole workflow follows the microservices architecture, enabling each component of the pipeline to operate as a standalone, replaceable module.
