CityDreamer: Compositional Generative Model of Unbounded 3D Cities
Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, Ziwei Liu
TL;DR
CityDreamer tackles unbounded 3D city generation by decoupling background scenery from building instances and employing BEV-based volumetric rendering. It introduces a City Background Generator with a generative hash grid and a Building Instance Generator with pixel-level local encoders and a style code, followed by a Compositor to fuse outputs into coherent city imagery. The approach is trained on CityGen datasets (OSM and GoogleEarth) to capture realistic layouts and appearances, achieving state-of-the-art results and enabling localized editing of individual buildings. This framework supports scalable, multi-view-consistent 3D city generation with practical applications in urban planning, gaming, and metaverse content creation.
Abstract
3D city generation is a desirable yet challenging task, since humans are more sensitive to structural distortions in urban environments. Additionally, generating 3D cities is more complex than generating 3D natural scenes, since buildings, as objects of the same class, exhibit a wider range of appearances than the relatively consistent appearance of objects like trees in natural scenes. To address these challenges, we propose CityDreamer, a compositional generative model designed specifically for unbounded 3D cities. Our key insight is that 3D city generation should be a composition of different types of neural fields: 1) various building instances, and 2) background stuff, such as roads and green lands. Specifically, we adopt the bird's eye view scene representation and employ a volumetric renderer for both instance-oriented and stuff-oriented neural fields. The generative hash grid and periodic positional embedding are tailored as scene parameterization to suit the distinct characteristics of building instances and background stuff. Furthermore, we contribute a suite of CityGen Datasets, including OSM and GoogleEarth, which comprises a vast amount of real-world city imagery to enhance the realism of the generated 3D cities both in their layouts and appearances. CityDreamer achieves state-of-the-art performance not only in generating realistic 3D cities but also in localized editing within the generated cities.
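The compositional idea above (separately rendered building instances fused with a rendered background) can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `composite`, the use of binary per-instance masks, and the simple mask-weighted blend are all assumptions made for illustration, and the actual compositor may differ.

```python
import numpy as np

def composite(background, buildings, masks):
    """Hypothetical compositor: paste each rendered building image over the
    rendered background wherever that building's binary mask is set."""
    out = background.astype(np.float32).copy()
    for image, mask in zip(buildings, masks):
        m = mask.astype(np.float32)[..., None]  # (H, W) -> (H, W, 1) for broadcasting
        out = m * image + (1.0 - m) * out       # keep background where mask is 0
    return out

# Toy 2x2 RGB example: a black background and one white "building" pixel.
bg = np.zeros((2, 2, 3), dtype=np.float32)
building = np.ones((2, 2, 3), dtype=np.float32)
mask = np.array([[1, 0], [0, 0]], dtype=bool)
result = composite(bg, [building], [mask])
```

Because each building is blended in through its own mask, replacing one entry of `buildings` changes only the pixels that mask covers, which is one way to picture the localized editing the abstract mentions.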
