Gen3DSR: Generalizable 3D Scene Reconstruction via Divide and Conquer from a Single View
Andreea Ardelean, Mert Özer, Bernhard Egger
TL;DR
Gen3DSR introduces a modular divide-and-conquer pipeline for reconstructing 3D scenes from a single image without end-to-end 3D supervision. The method first analyzes the scene holistically to produce depth, camera, and segmentation information, then reconstructs each object with a diffusion-prior-based single-view method enhanced by amodal completion, and finally assembles the results into a coherent scene and models the background. Its key contributions are a compositional framework that can be incrementally improved by swapping modules, a learned amodal completion component, and a robust reprojection/linking strategy that aligns object reconstructions to scene depth. Empirically, Gen3DSR achieves competitive or superior results on synthetic 3D-FRONT and real HOPE-Image data, including challenging real-world scenes, while maintaining zero-shot generalization.
Abstract
Single-view 3D reconstruction is currently approached from two dominant perspectives: reconstruction of scenes with limited diversity using 3D data supervision or reconstruction of diverse singular objects using large image priors. However, real-world scenarios are far more complex and exceed the capabilities of these methods. We therefore propose a hybrid method following a divide-and-conquer strategy. We first process the scene holistically, extracting depth and semantic information, and then leverage an object-level method for the detailed reconstruction of individual components. By splitting the problem into simpler tasks, our system is able to generalize to various types of scenes without retraining or fine-tuning. We purposely design our pipeline to be highly modular with independent, self-contained modules, to avoid the need for end-to-end training of the whole system. This enables the pipeline to naturally improve as future methods can replace the individual modules. We demonstrate the reconstruction performance of our approach on both synthetic and real-world scenes, comparing favorably against prior work. Project page: https://andreeadogaru.github.io/Gen3DSR
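The modular structure described above can be illustrated with a minimal sketch. This is not the authors' code or API: every name here (ScenePipeline, SceneAnalysis, and the analyze, complete, reconstruct, place, and background callables) is a hypothetical placeholder, and the stages are stubbed with strings so that only the control flow is shown. It assumes each stage can be treated as an independent callable.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, List

# Hypothetical placeholder types; a real system would use images, masks, meshes.
Image = Any
Mask = Any
Mesh = Any


@dataclass
class SceneAnalysis:
    depth: Any          # scene-level depth estimate
    camera: Any         # estimated camera parameters
    masks: List[Mask]   # one segmentation mask per detected object


@dataclass
class ScenePipeline:
    # Each stage is an independent callable, so any one can be swapped out.
    analyze: Callable[[Image], SceneAnalysis]            # holistic scene analysis
    complete: Callable[[Image, Mask], Image]             # amodal completion
    reconstruct: Callable[[Image], Mesh]                 # single-view object reconstruction
    place: Callable[[Mesh, Mask, SceneAnalysis], Mesh]   # align object to scene depth
    background: Callable[[Image, SceneAnalysis], Mesh]   # background modeling

    def run(self, image: Image) -> List[Mesh]:
        scene = self.analyze(image)
        parts = []
        for mask in scene.masks:
            crop = self.complete(image, mask)
            obj = self.reconstruct(crop)
            parts.append(self.place(obj, mask, scene))
        parts.append(self.background(image, scene))
        return parts


# Stub modules (assumptions for illustration only).
pipeline = ScenePipeline(
    analyze=lambda img: SceneAnalysis(depth=None, camera=None, masks=["chair", "table"]),
    complete=lambda img, m: f"completed({m})",
    reconstruct=lambda crop: f"mesh[{crop}]",
    place=lambda mesh, m, s: f"placed:{mesh}",
    background=lambda img, s: "background",
)
result = pipeline.run("image")

# Swapping a single module leaves the rest of the pipeline untouched.
upgraded = replace(pipeline, reconstruct=lambda crop: f"better_mesh[{crop}]")
upgraded_result = upgraded.run("image")
```

Because the stages only communicate through their inputs and outputs, replacing one of them (here the object reconstructor) requires no retraining of the others, which is the property the abstract attributes to the modular design.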
