Multi-Unit Floor Plan Recognition and Reconstruction Using Improved Semantic Segmentation of Raster-Wise Floor Plans
Lukas Kratochvila, Gijs de Jong, Monique Arkesteijn, Simon Bilik, Tomas Zemcik, Karel Horak, Jan S. Rellermeyer
TL;DR
This work tackles the challenge of generating 3D building representations from raster 2D floor plans to enable scalable digital twins for emergency planning. It introduces two end-to-end recognition-reconstruction pipelines, CAB1 and CAB2, built on MDA-Unet and MACU-Net with asymmetric convolution, a dual-channel/spatial attention mechanism, and multi-scale skip connections, coupled with a multi-task training objective and a heatmap-based opening regression. The reconstruction stage converts segmentation masks into vector polygons and refined 3D models, achieving a mean F1 score of $0.86$ and IoU of $0.76$ on CubiCasa, outperforming state-of-the-art baselines, while remaining applicable across several datasets (R3D, CVC-FP, MLSTRUCT-FP, MURF). The approach provides a practical, publicly available pipeline for generating 3D floor-plan representations from raster data, enabling safer and more efficient emergency planning and urban simulations.
Abstract
Digital twins have strong potential to form a significant part of urban management in emergency planning, as they allow more efficient design of escape routes, better orientation in exceptional situations, and faster rescue intervention. Nevertheless, creating such twins remains a largely manual effort due to a lack of 3D representations, which are available only in limited numbers for some new buildings. In this paper, we therefore aim to synthesize 3D information from commonly available 2D architectural floor plans. We propose two novel pixel-wise segmentation methods based on the MDA-Unet and MACU-Net architectures, with improved skip connections, an attention mechanism, and a training objective, together with a reconstruction stage of the pipeline that vectorizes the segmented plans to create a 3D model. The proposed methods are compared with two other state-of-the-art techniques on several benchmark datasets. On the commonly used CubiCasa benchmark dataset, our methods achieve a mean F1 score of 0.86 over five examined classes, outperforming the other pixel-wise approaches tested. We have also made our code publicly available to support research in the field.
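To make the reported metric concrete, the following is a minimal sketch of how a class-averaged F1 score (and the closely related IoU) can be computed from a predicted and a ground-truth label map. It is an illustration under stated assumptions, not the evaluation code used in this work: the function names `per_class_f1_iou` and `mean_scores`, the integer class encoding, and the choice to skip classes absent from both maps are all hypothetical, and the exact averaging protocol (per image or per dataset) is not specified in this section.

```python
from typing import Dict, Sequence, Tuple


def per_class_f1_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> Dict[int, Tuple[float, float]]:
    """Return {class_id: (f1, iou)} for two equally sized 2D label maps.

    Assumption: labels are integers in [0, num_classes).
    """
    tp = [0] * num_classes  # pixels correctly assigned to the class
    fp = [0] * num_classes  # pixels wrongly assigned to the class
    fn = [0] * num_classes  # pixels of the class that were missed

    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                tp[p] += 1
            else:
                fp[p] += 1
                fn[t] += 1

    scores: Dict[int, Tuple[float, float]] = {}
    for c in range(num_classes):
        union = tp[c] + fp[c] + fn[c]
        if union == 0:
            # Assumption: a class absent from both maps is skipped
            # rather than counted as a perfect or a zero score.
            continue
        iou = tp[c] / union
        f1 = 2 * tp[c] / (2 * tp[c] + fp[c] + fn[c])
        scores[c] = (f1, iou)
    return scores


def mean_scores(scores: Dict[int, Tuple[float, float]]) -> Tuple[float, float]:
    """Unweighted mean of F1 and IoU over the classes present in `scores`."""
    n = len(scores)
    mean_f1 = sum(f1 for f1, _ in scores.values()) / n
    mean_iou = sum(iou for _, iou in scores.values()) / n
    return mean_f1, mean_iou


# Toy 3x3 example with three hypothetical classes (0, 1, 2).
pred = [[0, 0, 1], [0, 1, 1], [2, 2, 1]]
target = [[0, 0, 1], [0, 0, 1], [2, 2, 2]]
scores = per_class_f1_iou(pred, target, num_classes=3)
print(mean_scores(scores))
```

For a single class the two metrics are tied by $F_1 = 2\,\mathrm{IoU} / (1 + \mathrm{IoU})$, so they rank predictions identically per class; the relation does not hold exactly for the class-averaged values.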
