FacaDiffy: Inpainting Unseen Facade Parts Using Diffusion Models

Thomas Froech, Olaf Wysocki, Yan Xia, Junyu Xie, Benedikt Schwab, Daniel Cremers, Thomas H. Kolbe

TL;DR

FacaDiffy tackles the problem of incomplete 2D facade conflict maps by combining a deterministic ray-casting pipeline with a personalized diffusion-based inpainting approach. It first computes conflict maps from existing 3D models and laser scans, then augments training data with synthetic conflict maps from random city models to personalize a Stable Diffusion inpainting model using DreamBooth. The method demonstrates state-of-the-art performance in conflict-map completion, yields notable improvements in LoD3 reconstruction detection rates, and offers a scalable path for deploying facade inpainting in real-world city-model pipelines. The work provides a practical workflow for enhancing semantic 3D building reconstruction through targeted, dataset-efficient personalization of generative models.

Abstract

High-detail semantic 3D building models are frequently utilized in robotics, geoinformatics, and computer vision. One key aspect of creating such models is employing 2D conflict maps that detect openings' locations in building facades. Yet, in reality, these maps are often incomplete due to obstacles encountered during laser scanning. To address this challenge, we introduce FacaDiffy, a novel method for inpainting unseen facade parts by completing conflict maps with a personalized Stable Diffusion model. Specifically, we first propose a deterministic ray analysis approach to derive 2D conflict maps from existing 3D building models and corresponding laser scanning point clouds. Furthermore, we facilitate the inpainting of unseen facade objects into these 2D conflict maps by leveraging the potential of personalizing a Stable Diffusion model. To complement the scarcity of real-world training data, we also develop a scalable pipeline to produce synthetic conflict maps using random city model generators and annotated facade images. Extensive experiments demonstrate that FacaDiffy achieves state-of-the-art performance in conflict map completion compared to various inpainting baselines and increases the detection rate by $22\%$ when applying the completed conflict maps for high-definition 3D semantic building reconstruction. The code is publicly available in the corresponding GitHub repository: https://github.com/ThomasFroech/InpaintingofUnseenFacadeObjects
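
To make the idea of a deterministic ray analysis more concrete, the following is a minimal sketch of how a single laser ray could be labeled against one planar wall of a building model. It is not the authors' implementation. The function name `classify_ray`, the representation of the wall as a point plus a normal, and the exact decision rule are assumptions made for illustration: a measured point within a tolerance `t` of the wall is treated as confirming the model, a point clearly behind the wall as conflicting with it (for example, a ray passing through an opening), and every other case as unknown.

```python
import math


def classify_ray(v, p, wall_point, wall_normal, t):
    """Label one laser ray against a planar wall (hypothetical helper).

    v           -- sensor viewpoint (x, y, z)
    p           -- measured laser point (x, y, z)
    wall_point  -- any point on the wall plane of the building model
    wall_normal -- normal vector of the wall plane (need not be unit length)
    t           -- distance tolerance, in the same unit as the coordinates

    The decision rule below is an assumption for illustration only.
    """
    ray = [pi - vi for pi, vi in zip(p, v)]
    d_point = math.sqrt(sum(c * c for c in ray))
    if d_point == 0.0:
        return "unknown"  # degenerate ray: point coincides with viewpoint

    direction = [c / d_point for c in ray]
    denom = sum(d * n for d, n in zip(direction, wall_normal))
    if abs(denom) < 1e-12:
        return "unknown"  # ray runs parallel to the wall plane

    # Distance along the ray from the viewpoint to the wall plane.
    d_wall = sum((q - vi) * n
                 for q, vi, n in zip(wall_point, v, wall_normal)) / denom
    if d_wall <= 0.0:
        return "unknown"  # wall plane lies behind the viewpoint

    if abs(d_point - d_wall) <= t:
        return "confirming"   # point lies on the wall within the tolerance
    if d_point > d_wall + t:
        return "conflicting"  # ray passed through the modeled wall
    return "unknown"          # ray stopped before reaching the wall


if __name__ == "__main__":
    viewpoint = (0.0, 0.0, 0.0)
    wall = ((10.0, 0.0, 0.0), (1.0, 0.0, 0.0))  # wall plane at x = 10
    for point in [(5.0, 0.0, 0.0), (10.05, 0.0, 0.0), (12.0, 0.0, 0.0)]:
        print(point, classify_ray(viewpoint, point, *wall, t=0.1))
```

Projecting the labels of many such rays onto the facade plane would give a raster with confirming, conflicting, and unknown cells. A full pipeline would additionally have to restrict the intersection to the wall polygon rather than the infinite plane, which this sketch omits.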

Paper Structure

This paper contains 35 sections, 1 equation, 6 figures, 5 tables.

Figures (6)

  • Figure 1: The conflict maps are prone to occlusions (black) owing to the reliance on the ray-to-model (yellow) analysis. We complement conflicts (red) and confirmations (green) by employing the proposed FacaDiffy.
  • Figure 2: Schematic overview of FacaDiffy. By combining an existing LoD2 building model and corresponding laser scanning point clouds, we formulate a deterministic method based on ray-casting analysis to obtain incomplete conflict maps and corresponding binary inpainting masks (top branch). We generate synthetic conflict maps to personalize the Stable Diffusion model (bottom branch), which is employed for inpainting given the partial evidence of deterministic conflict maps. These can be utilized for various downstream applications such as accurate LoD3 reconstructions, facade solar potential analysis, etc.
  • Figure 3: Schematic overview of the conflict determination in the ray casting approach with the viewpoint $\mathbf{v}$, the point $\mathbf{p}$, and the tolerance $\pm t$. Three distinct scenarios are illustrated: (a) unknown; (b) confirming; (c) conflicting.
  • Figure 4: Exemplary inpainting results on a real conflict map. The similarity between ground-truth (LoD3) and inpainted results is measured in terms of SSIM (blue), IoU (brown), and LPIPS (purple). The conflict maps are color-coded with the conflicting (red), confirming (green), and unknown/masked (black) areas.
  • Figure 5: Counterintuitive IoU evaluation result for a randomly masked conflict map derived from the CMP-database of annotated images. While our method performs better qualitatively, the IoU yields counterintuitive results.
  • ...and 1 more figure
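
The captions of Figures 4 and 5 report IoU between inpainted and ground-truth conflict maps. As a hedged illustration of what that metric measures, the sketch below computes IoU for two binary masks in which 1 marks a conflicting pixel. The function name `iou`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the paper.

```python
def iou(pred, truth):
    """Intersection over union of two equally sized binary masks.

    pred, truth -- nested lists of 0/1 values (1 = conflicting pixel).
    Returns 1.0 when both masks are empty (an assumed convention).
    """
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for a, b in zip(pred_row, truth_row):
            intersection += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return intersection / union if union else 1.0


if __name__ == "__main__":
    truth = [[0, 1, 1, 0],
             [0, 1, 1, 0]]
    # Same 2x2 "window", shifted one pixel to the right.
    shifted = [[0, 0, 1, 1],
               [0, 0, 1, 1]]
    print(iou(truth, truth))    # 1.0
    print(iou(shifted, truth))  # 2 / 6 = 0.333...
```

The shifted example shows that IoU rewards only exact pixel overlap: a window of the right size and shape that is displaced by a single pixel already drops to one third. This is one way a pixel-overlap score can disagree with a qualitative judgment of an inpainting result.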