Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata
Dongsu Zhang, Francis Williams, Zan Gojcic, Karsten Kreis, Sanja Fidler, Young Min Kim, Amlan Kar
TL;DR
This work tackles the generation of high-fidelity, large-scale outdoor 3D scenes from sparse LiDAR data for simulation. It introduces hierarchical Generative Cellular Automata (hGCA), a two-stage, coarse-to-fine model. In the first stage, a coarse GCA conditioned on a light-weight bird's-eye-view (BEV) planner creates a global, low-resolution layout. In the second stage, a continuous GCA with local implicit surfaces upsamples that layout into a detailed, high-resolution mesh. On synthetic datasets and real-world Waymo data, hGCA shows higher extrapolation fidelity and better sim-to-real generalization than semantic scene completion and indoor-scene baselines, and it also produces novel content guided by geometric cues. The approach points toward scalable, simulation-ready environment generation from AV sensing data, though it currently omits textures and has comparatively slow runtimes, leaving room for future work on efficiency and realism.
Abstract
We aim to generate fine-grained 3D geometry from large-scale sparse LiDAR scans, abundantly captured by autonomous vehicles (AV). Contrary to prior work on AV scene completion, we aim to extrapolate fine geometry from unlabeled LiDAR scans and beyond their spatial limits, taking a step towards generating realistic, high-resolution, simulation-ready 3D street environments. We propose hierarchical Generative Cellular Automata (hGCA), a spatially scalable conditional 3D generative model that, following GCA, grows geometry recursively with local kernels in a coarse-to-fine manner and is equipped with a light-weight planner to induce global consistency. Experiments on synthetic scenes show that hGCA generates plausible scene geometry with higher fidelity and completeness than state-of-the-art baselines. Our model generalizes strongly from sim to real, qualitatively outperforming baselines on the Waymo Open Dataset. We also show anecdotal evidence of the ability to create novel objects from real-world geometric cues, even when trained on limited synthetic content. More results and details can be found at https://research.nvidia.com/labs/toronto-ai/hGCA/.
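To make the phrase "grows geometry recursively with local kernels" more concrete, here is a minimal toy sketch of a cellular-automaton-style growth loop on a sparse voxel set. It is not the authors' implementation: the function names (`neighborhood`, `toy_occupancy_prob`, `gca_step`, `grow`) are hypothetical, and the hand-written probability rule is only an assumed stand-in for the learned local kernel. The sketch shows one property only: each step samples the next set of occupied cells exclusively from the neighborhood of the current cells, so computation stays local and sparse.

```python
import random
from itertools import product


def neighborhood(cells, radius=1):
    """Return every cell within an L-infinity `radius` of any occupied cell."""
    offsets = list(product(range(-radius, radius + 1), repeat=3))
    return {
        (x + dx, y + dy, z + dz)
        for (x, y, z) in cells
        for (dx, dy, dz) in offsets
    }


def toy_occupancy_prob(cell, state):
    """Hypothetical stand-in for a learned kernel: favor the ground plane z == 0."""
    return 0.9 if cell[2] == 0 else 0.05


def gca_step(state, rng, prob_fn=toy_occupancy_prob, radius=1):
    """Sample the next occupied set from the neighborhood of the current one."""
    candidates = sorted(neighborhood(state, radius))  # sorted for reproducibility
    return {c for c in candidates if rng.random() < prob_fn(c, state)}


def grow(seed_cells, steps, rng_seed=0):
    """Apply the local sampling step repeatedly, starting from the seed cells."""
    rng = random.Random(rng_seed)
    state = set(seed_cells)
    for _ in range(steps):
        next_state = gca_step(state, rng)
        if not next_state:  # keep the last non-empty state if sampling dies out
            break
        state = next_state
    return state


if __name__ == "__main__":
    cells = grow({(0, 0, 0)}, steps=5)
    print(len(cells), "occupied cells after 5 steps")
```

Because each step can only reach the immediate neighborhood of the current state, a seed at the origin can spread by at most one cell per step in each direction. In this toy setting, a coarse-to-fine variant would run the same loop on a low-resolution grid first and then refine the result on a finer grid; how hGCA actually couples the two stages is described in the paper, not here.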
