HouseLayout3D: A Benchmark and Training-Free Baseline for 3D Layout Estimation in the Wild
Valentin Bieri, Marie-Julie Rakotosaona, Keisuke Tateno, Francis Engelmann, Leonidas Guibas
TL;DR
This work introduces HouseLayout3D, a real-world benchmark for 3D layout estimation in large multi-floor buildings, addressing the limitations of single-room synthetic datasets. It also proposes MultiFloor3D, a training-free baseline that combines modern reconstruction with a layout-prototype fitting strategy and outperforms prior methods on both HouseLayout3D and existing datasets. The dataset comprises 16 buildings, 33 levels, and 317 rooms, with detailed CAD-style annotations for walls, floors, ceilings, stairs, doors, and windows derived from MP3D, which enables cross-floor reasoning. The results reveal significant gaps in current methods for building-scale layouts and highlight the potential of building-wide reasoning and 3D scene synthesis to advance perception and navigation tasks.
Abstract
Current 3D layout estimation models are primarily trained on synthetic datasets containing simple single-room or single-floor environments. As a consequence, they cannot natively handle large multi-floor buildings and require scenes to be split into individual floors before processing. This removes the global spatial context that is essential for reasoning about structures such as staircases that connect multiple levels. In this work, we introduce HouseLayout3D, a real-world benchmark designed to support progress toward full building-scale layout estimation, including multiple floors and architecturally intricate spaces. We also present MultiFloor3D, a simple training-free baseline that leverages recent scene understanding methods and already outperforms existing 3D layout estimation models on both our benchmark and prior datasets, highlighting the need for further research in this direction. Data and code are available at: https://houselayout3d.github.io.
