Urban Spatio-Temporal Foundation Models for Climate-Resilient Housing: Scaling Diffusion Transformers for Disaster Risk Prediction
Olaf Yunus Laitinen Imanov, Derya Umut Kulali, Taner Yilmaz
TL;DR
Skjold-DiT introduces a diffusion-transformer-based framework that fuses multi-modal urban data with transportation-network signals to forecast building-level climate risk and generate hazard-conditioned accessibility layers for intelligent-vehicle routing. The architecture combines Norrland-Fusion for multi-modal encoding, Fjell-Prompt for cross-city zero-shot transfer, and Valkyrie-Forecast for probabilistic counterfactual simulation, validated on the BCUR dataset of 847,392 buildings across six cities. Results show strong predictive performance, robust cross-city generalization, and calibrated uncertainty, enabling scenario analysis for emergency routing and equity-focused policy. The work demonstrates practical deployment pathways with edge-cloud pipelines and policy-relevant insights aligned with World Urban Forum 13 resilience priorities, offering a scalable tool for climate-resilient urban planning and transportation optimization. $\Delta t$-dependent forecasts, hazard-conditioned routing, and explicit intervention prompts position diffusion-transformer urban foundation models as a principled approach for integrating climate science, housing vulnerability, and mobility in smart-city decision support.
Abstract
Climate hazards increasingly disrupt urban transportation and emergency-response operations by damaging housing stock, degrading infrastructure, and reducing network accessibility. This paper presents Skjold-DiT, a diffusion-transformer framework that integrates heterogeneous spatio-temporal urban data to forecast building-level climate-risk indicators while explicitly incorporating transportation-network structure and accessibility signals relevant to intelligent vehicles (e.g., emergency reachability and evacuation-route constraints). Concretely, Skjold-DiT enables hazard-conditioned routing constraints by producing calibrated, uncertainty-aware accessibility layers (reachability, travel-time inflation, and route redundancy) that can be consumed by intelligent-vehicle routing and emergency dispatch systems. Skjold-DiT combines: (1) Fjell-Prompt, a prompt-based conditioning interface designed to support cross-city transfer; (2) Norrland-Fusion, a cross-modal attention mechanism unifying hazard maps/imagery, building attributes, demographics, and transportation infrastructure into a shared latent representation; and (3) Valkyrie-Forecast, a counterfactual simulator for generating probabilistic risk trajectories under intervention prompts. We introduce the Baltic-Caspian Urban Resilience (BCUR) dataset with 847,392 building-level observations across six cities, including multi-hazard annotations (e.g., flood and heat indicators) and transportation accessibility features. Experiments evaluate prediction quality, cross-city generalization, calibration, and downstream transportation-relevant outcomes, including reachability and hazard-conditioned travel times under counterfactual interventions.
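To make the notion of a hazard-conditioned accessibility layer concrete, the following is a minimal sketch and not the Skjold-DiT implementation. It assumes a toy directed road graph with travel times in minutes and a hypothetical per-edge closure probability standing in for a model forecast. The names `shortest_times`, `hazard_conditioned`, `accessibility_layer`, the closure `threshold`, and the linear `slowdown` rule are all illustrative assumptions. The sketch covers two of the three quantities named above (reachability and travel-time inflation); route redundancy is omitted for brevity.

```python
import heapq
import math


def shortest_times(graph, source):
    """Dijkstra over {node: [(neighbor, minutes), ...]}; returns {node: minutes}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def hazard_conditioned(graph, closure_prob, threshold=0.5, slowdown=2.0):
    """Apply an assumed hazard rule to every edge.

    Edges with closure probability >= threshold are treated as impassable;
    the remaining edges are slowed to w * (1 + slowdown * p).
    """
    conditioned = {}
    for u, edges in graph.items():
        kept = []
        for v, w in edges:
            p = closure_prob.get((u, v), 0.0)
            if p < threshold:
                kept.append((v, w * (1.0 + slowdown * p)))
        conditioned[u] = kept
    return conditioned


def accessibility_layer(graph, closure_prob, source):
    """Per-node reachability and travel-time inflation relative to the baseline."""
    base = shortest_times(graph, source)
    hazard = shortest_times(hazard_conditioned(graph, closure_prob), source)
    layer = {}
    for node, t0 in base.items():
        if node == source:
            continue
        t1 = hazard.get(node)
        layer[node] = {
            "reachable": t1 is not None,
            "inflation": t1 / t0 if t1 is not None else math.inf,
        }
    return layer


# Toy network: an emergency station serving four building clusters (hypothetical).
ROADS = {
    "station": [("A", 4.0), ("B", 6.0)],
    "A": [("C", 5.0)],
    "B": [("C", 4.0)],
    "C": [("D", 3.0)],
}
# Hypothetical flood-closure probabilities per directed edge.
CLOSURE = {("A", "C"): 0.8, ("station", "B"): 0.25, ("C", "D"): 0.9}

LAYER = accessibility_layer(ROADS, CLOSURE, "station")
for node in sorted(LAYER):
    print(node, LAYER[node])
```

In this toy scenario the likely closure of the A-to-C link forces traffic to C onto the slower route through B, and the likely closure of the C-to-D link leaves D unreachable from the station. A dispatch system could consume such a layer as a routing constraint; in the framework described above, calibrated forecast distributions rather than fixed point probabilities would drive the edge rule.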
