Geodiffussr: Generative Terrain Texturing with Elevation Fidelity
Tai Inui, Alexander Matsumura, Edgar Simo-Serra
TL;DR
This work tackles fast, controllable terrain generation that strictly adheres to a given Digital Elevation Map (DEM) while enabling text-driven texture synthesis. It introduces Geodiffussr, a flow-matching pipeline whose UNet is conditioned on text via cross-attention and receives DEM features from a pretrained VGG-16 through multi-scale content aggregation (MCA); the pipeline is augmented by upscaling for rendering. The key finding is that full MCA substantially improves perceptual texture quality and elevation-texture alignment over non-MCA baselines (e.g., FID 10.29, LPIPS 0.066, ΔdCor 0.0016), establishing a strong baseline for 2.5D terrain ideation and previz. The work also provides a biome-diverse DEM–satellite dataset and discusses practical paths to production-scale resolutions, positioning the approach as complementary to physical terrain and ecosystem models.
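For readers unfamiliar with flow matching, the training objective can be sketched as follows. This is a generic formulation under assumed notation, not necessarily the exact loss used by Geodiffussr: $x_1$ denotes a satellite texture, $x_0$ Gaussian noise, $h$ the DEM, $c$ the text caption, $v_\theta$ the UNet, and the straight-line interpolation path is an assumption.

$$
x_t = (1 - t)\,x_0 + t\,x_1, \qquad
\mathcal{L}(\theta) = \mathbb{E}_{t \sim \mathcal{U}[0,1],\; x_0 \sim \mathcal{N}(0, I),\; (x_1, h, c)} \left\| v_\theta(x_t, t, h, c) - (x_1 - x_0) \right\|_2^2
$$

Under this formulation, sampling would integrate $\mathrm{d}x_t/\mathrm{d}t = v_\theta(x_t, t, h, c)$ from $t = 0$ to $t = 1$ with the DEM and caption held fixed.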
Abstract
Large-scale terrain generation remains a labor-intensive task in computer graphics. We introduce Geodiffussr, a flow-matching pipeline that synthesizes text-guided texture maps while strictly adhering to a supplied Digital Elevation Map (DEM). The core mechanism is multi-scale content aggregation (MCA): DEM features from a pretrained encoder are injected into UNet blocks at multiple resolutions to enforce global-to-local elevation consistency. Compared with a non-MCA baseline, MCA markedly improves visual fidelity and strengthens height-appearance coupling (FID $\downarrow$ 49.16%, LPIPS $\downarrow$ 32.33%, $\Delta$dCor $\downarrow$ to 0.0016). To train and evaluate Geodiffussr, we assemble a globally distributed, biome- and climate-stratified corpus of triplets, each pairing an SRTM-derived DEM with Sentinel-2 imagery and a vision-grounded natural-language caption that describes the visible land cover. We position Geodiffussr as a strong baseline and a step toward controllable 2.5D landscape generation for coarse-scale ideation and previz, complementary to physically based terrain and ecosystem simulators.
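To make the height-appearance coupling metric more concrete, here is a minimal sketch of sample distance correlation (dCor) between two scalar series, such as per-pixel elevation and a per-pixel appearance value. The abstract does not define $\Delta$dCor, so the `delta_dcor` helper below is an assumption: it takes the absolute gap between the dCor measured on real imagery and on generated imagery. All function names are hypothetical, and the pure-Python loops are quadratic in the sample count, so this is only meant for small subsamples.

```python
import math


def _centered_distances(values):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length scalar series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2_xy = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar2_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    dvar2_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0  # one series is constant, so there is no dependence to measure
    return math.sqrt(max(dcov2_xy, 0.0) / denom)


def delta_dcor(height, real_appearance, generated_appearance):
    """ASSUMED definition: gap between real and generated height-appearance dCor."""
    real = distance_correlation(height, real_appearance)
    generated = distance_correlation(height, generated_appearance)
    return abs(real - generated)
```

Under this assumed definition, a value near zero means the generated texture depends on elevation about as strongly as the real satellite image does.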
