Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation
Aleksander Plocharski, Jan Swidzinski, Przemyslaw Musialski
TL;DR
Pro-DG edits architectural facades by coupling a neuro-symbolic procedural grammar with diffusion-based synthesis. It reconstructs a facade’s procedural representation from an input image, applies user-driven structural edits, and guides the diffusion process through hierarchical symbol matching and controlled conditioning. The approach introduces a novel SVD-based structural similarity measure and a content-aware histogram metric to align original and edited structures, enabling globally coherent edits that preserve architectural identity. Quantitative and qualitative evaluations, complemented by a user study, show improved edit adherence and identity preservation compared with baselines, and point to the practical value of integrating symbolic grammars with modern generative models for structured image editing.
Abstract
We present Pro-DG, a framework for procedurally controllable, photo-realistic facade generation that combines a procedural shape grammar with diffusion-based image synthesis. Starting from a single input image, we reconstruct its facade layout using grammar rules, then edit that structure through user-defined transformations. Because facades are inherently hierarchical structures, we introduce a hierarchical matching procedure that aligns the original and edited facade structures at multiple levels; the resulting correspondences are used to build control maps that guide a generative diffusion pipeline. This approach retains local appearance fidelity while accommodating large-scale edits such as floor duplication or window rearrangement. We provide a thorough evaluation, comparing Pro-DG against inpainting-based baselines and synthetic ground truths. Our user study and quantitative measurements indicate improved preservation of architectural identity and higher edit accuracy. Our method is the first to integrate neuro-symbolically derived shape grammars with modern generative models, and it highlights the broader potential of such approaches for precise and controllable image manipulation.
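To make the idea of hierarchical matching concrete, the following is a minimal sketch and not the authors' implementation. It assumes a facade can be represented as a list of floors, each a list of element symbols; the function names, the histogram-overlap score, and the position-based tie-breaking are all hypothetical choices made for illustration. It matches floors first and then matches elements within each matched floor.

```python
from collections import Counter


def floor_similarity(a, b):
    """Histogram overlap of the symbol counts of two floors, in [0, 1].

    Assumption: a floor is a plain list of symbol strings.
    """
    overlap = sum((Counter(a) & Counter(b)).values())
    return overlap / max(len(a), len(b), 1)


def match_hierarchically(original, edited):
    """Match an edited facade against a non-empty original facade.

    Level 1: each edited floor is assigned the most similar original floor.
    Level 2: each edited element is assigned the original element with the
    same symbol whose relative horizontal position is closest, or None if
    the matched floor has no element with that symbol.
    Returns one (original_floor_index, element_indices) pair per edited floor.
    """
    matches = []
    for efloor in edited:
        oi = max(range(len(original)),
                 key=lambda i: floor_similarity(original[i], efloor))
        ofloor = original[oi]
        elem_map = []
        for k, sym in enumerate(efloor):
            candidates = [j for j, s in enumerate(ofloor) if s == sym]
            if not candidates:
                elem_map.append(None)
                continue
            rel = k / len(efloor)
            elem_map.append(min(candidates,
                                key=lambda j: abs(j / len(ofloor) - rel)))
        matches.append((oi, elem_map))
    return matches


# Hypothetical edit: duplicate the upper floor of a two-floor facade.
original = [["door", "window", "window"], ["window", "window", "window"]]
edited = original + [["window", "window", "window"]]
result = match_hierarchically(original, edited)
```

In this toy example both upper floors of the edited facade map back to the single upper floor of the original, which is the kind of correspondence a control map for a floor-duplication edit would need.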
