Robust Disaster Assessment from Aerial Imagery Using Text-to-Image Synthetic Data
Tarun Kalluri, Jihyeon Lee, Kihyuk Sohn, Sahil Singla, Manmohan Chandraker, Joseph Xu, Jeremiah Liu
TL;DR
This work tackles poor cross-domain robustness in aerial disaster assessment when labeled post-disaster data are scarce. It introduces a scalable pipeline that uses mask-guided text-to-image editing (via the MUSE model) to synthesize post-disaster imagery conditioned on target-domain pre-disaster images, paired with a simple two-stage training regime that leverages source-domain labels and synthetic target data. Empirical results on xBD and SKAI show significant improvements over source-only baselines in both single-source and multi-source transfer settings, with gains of up to roughly 29% in AUPRC on challenging cross-geography transfers. The approach enables rapid, low-cost generation of target-domain supervision and yields practical robustness gains for disaster response in under-resourced geographies. The results remain sensitive to the quality of the generated imagery, and more advanced filtering and domain-specific generator tuning may bring further benefits.
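For readers unfamiliar with the metric, AUPRC (area under the precision-recall curve) is commonly estimated as average precision: the mean of the precision values at each rank where a true positive is retrieved. The following is a minimal sketch of that standard estimator, not the authors' evaluation code. The function name `average_precision` and the toy labels and scores are illustrative assumptions, and ties in the scores are broken by input order.

```python
def average_precision(labels, scores):
    """Estimate AUPRC as average precision.

    labels: list of 0/1 ground-truth values (1 = damaged, by assumption).
    scores: list of model confidences, higher means more likely positive.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")

    # Rank examples from highest to lowest score (stable for tied scores).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    true_positives = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            true_positives += 1
            # Precision at the rank where this positive is retrieved.
            precision_sum += true_positives / rank
    return precision_sum / n_pos


# Toy data (hypothetical): positives are retrieved at ranks 1 and 3.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
print(round(average_precision(toy_labels, toy_scores), 4))  # 0.8333
```

Because the metric depends only on the ranking of the scores, it is a reasonable choice when damaged buildings are a small minority of the examples.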
Abstract
We present a simple and efficient method that leverages emerging text-to-image generative models to create large-scale synthetic supervision for damage assessment from aerial images. While significant recent advances have improved techniques for damage assessment using aerial or satellite imagery, these techniques still suffer from poor robustness in domains where manually labeled data are unavailable, directly impacting post-disaster humanitarian assistance in such under-resourced geographies. Our contribution towards improving domain robustness in this scenario is two-fold. First, we leverage the text-guided, mask-based image editing capabilities of generative models to build an efficient and easily scalable pipeline that generates thousands of post-disaster images from low-resource domains. Second, we propose a simple two-stage training approach that trains robust models using manual supervision from different source domains along with the generated synthetic target-domain data. We validate the strength of our proposed framework in a cross-geography domain transfer setting on xBD and SKAI images, in both single-source and multi-source settings, achieving significant improvements over a source-only baseline in each case.
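The abstract does not state what each of the two training stages does, so the sketch below is only one plausible reading, written under explicit assumptions: stage 1 fits a model on labeled source-domain data alone, and stage 2 continues training on the source data mixed with synthetic target-domain data at a lower learning rate. A tiny logistic-regression classifier on feature vectors stands in for the image model. All names (`sgd_epochs`, `two_stage_train`), hyperparameters, and data are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sgd_epochs(w, b, data, lr, epochs, rng):
    """Run plain SGD on the logistic loss; data is a list of (features, label)."""
    data = list(data)  # copy so the caller's list is not shuffled in place
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            grad = p - y  # derivative of the logistic loss w.r.t. the logit
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def two_stage_train(source, synthetic_target, dim, seed=0):
    rng = random.Random(seed)
    w, b = [0.0] * dim, 0.0
    # Stage 1 (assumed): manually labeled source-domain examples only.
    w, b = sgd_epochs(w, b, source, lr=0.1, epochs=20, rng=rng)
    # Stage 2 (assumed): source plus synthetic target examples, smaller steps.
    w, b = sgd_epochs(w, b, source + synthetic_target, lr=0.05, epochs=20, rng=rng)
    return w, b


# Toy two-feature data (hypothetical): label 1 = damaged, 0 = undamaged.
source = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
synthetic_target = [([0.9, 0.1], 1), ([0.1, 0.9], 0)]
w, b = two_stage_train(source, synthetic_target, dim=2)
print(sigmoid(w[0] * 0.9 + w[1] * 0.1 + b) > 0.5)
```

The split into a source-only stage followed by a mixed stage is a design choice of this sketch; other schedules (for example, training on synthetic data first) would fit the abstract's wording equally well.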
