COHO: Context-Sensitive City-Scale Hierarchical Urban Layout Generation
Liu He, Daniel Aliaga
TL;DR
COHO introduces a context-sensitive, city-scale urban layout generator built on a canonical graph representation and a Graph-based Masked AutoEncoder (GMAE). By encoding blocks, buildings, and communities into a unified graph and applying self-supervised masking with priority-based iterative sampling, it achieves realistic, semantically consistent 2.5D layouts across 330 US cities. It outperforms baselines on context-awareness and realism while offering fast inference and auxiliary capabilities, such as socio-economic metric prediction and semantic manipulation. The work provides an open dataset and code to enable scalable urban layout synthesis for planning, digital twins, and content creation, with potential extensions to 3D city modeling and multi-view synthesis.
Abstract
The generation of large-scale urban layouts has garnered substantial interest across various disciplines. Prior methods rely either on procedural generation, which requires manual rule coding, or on deep learning, which needs abundant data. Neither accounts for the context-sensitive nature of urban layout generation. Our approach addresses this gap by leveraging a canonical graph representation for the entire city, which facilitates scalability and captures the multi-layer semantics inherent in urban layouts. We introduce a novel graph-based masked autoencoder (GMAE) for city-scale urban layout generation. The method encodes attributed buildings, city blocks, communities, and cities into a unified graph structure, enabling self-supervised masked training of the graph autoencoder. Additionally, we employ scheduled iterative sampling for 2.5D layout generation, prioritizing the generation of important city blocks and buildings. Our approach achieves good realism, semantic consistency, and correctness across the heterogeneous urban styles in 330 US cities. Code and datasets are released at https://github.com/Arking1995/COHO.
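The two mechanisms named in the abstract, masking node attributes for self-supervised training and filling masked nodes in priority order, can be illustrated with a toy sketch. This is not the released COHO code. The function names (`mask_nodes`, `iterative_sample`, `mean_of_neighbors`), the adjacency-dict graph format, and the priority scores are all hypothetical, and a simple neighbor average stands in for the GMAE decoder.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Feature = Optional[List[float]]   # None marks a masked node
Graph = Dict[int, List[int]]      # adjacency list: node id -> neighbor ids


def mask_nodes(features: Dict[int, Feature], ratio: float,
               rng: random.Random) -> Dict[int, Feature]:
    """Hide a random fraction of node features, as in masked-autoencoder training."""
    ids = sorted(features)
    hidden = set(rng.sample(ids, int(round(ratio * len(ids)))))
    return {i: (None if i in hidden else features[i]) for i in ids}


def mean_of_neighbors(node: int, state: Dict[int, Feature], edges: Graph) -> List[float]:
    """Stand-in predictor (assumption): average the known neighbor features."""
    known = [state[j] for j in edges.get(node, []) if state[j] is not None]
    if not known:
        return [0.0]
    dim = len(known[0])
    return [sum(v[d] for v in known) / len(known) for d in range(dim)]


def iterative_sample(features: Dict[int, Feature], edges: Graph,
                     priority: Dict[int, float],
                     predict: Callable[[int, Dict[int, Feature], Graph], List[float]],
                     per_step: int = 1) -> Tuple[Dict[int, Feature], List[int]]:
    """Fill masked nodes in descending priority, `per_step` nodes per iteration."""
    state = dict(features)
    pending = sorted((i for i, f in state.items() if f is None),
                     key=lambda i: -priority.get(i, 0.0))
    order: List[int] = []
    while pending:
        batch, pending = pending[:per_step], pending[per_step:]
        # Predict the whole batch from the current state, then commit it,
        # so lower-priority nodes are conditioned on earlier generations.
        generated = {i: predict(i, state, edges) for i in batch}
        state.update(generated)
        order.extend(batch)
    return state, order


if __name__ == "__main__":
    # A chain of four "blocks": 0 - 1 - 2 - 3, with the two middle ones masked.
    edges = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    feats = {0: [1.0], 1: None, 2: None, 3: [3.0]}
    filled, order = iterative_sample(feats, edges, {1: 0.2, 2: 0.9}, mean_of_neighbors)
    print(order, filled)
```

In this toy run the higher-priority node 2 is generated first from its one known neighbor, and node 1 is then generated with node 2 already in context. That ordering effect is what a priority schedule is meant to provide. A real system would replace `mean_of_neighbors` with a learned decoder.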
