Digital Elevation Model Estimation from RGB Satellite Imagery using Generative Deep Learning
Alif Ilham Madani, Riska A. Kuswati, Alex M. Lechner, Muhamad Risqi U. Saputra
TL;DR
Problem: generating DEMs from RGB satellite imagery in data-sparse settings. Approach: a pix2pix conditional GAN trained in two stages on a global Landsat-SRTM RGB-DEM dataset, with cloud-free preprocessing and SSIM-guided sample filtering. Key findings: the baseline reaches a mean RMSE of 0.4876 and a mean SSIM of 0.1896; after filtering to samples with SSIM ≥ 0.2, RMSE drops to 0.4671 and SSIM rises to 0.2065. Performance is strongest in mountainous regions and weakest in lowland and residential areas. Significance: demonstrates a cost-effective alternative to LiDAR and stereo methods for global DEM estimation, and highlights the ongoing challenge of generalizing across terrains. Future work: integrating multispectral or radar data to improve performance in urban environments.
Abstract
Digital Elevation Models (DEMs) are vital datasets for geospatial applications such as hydrological modeling and environmental monitoring. However, conventional methods for generating DEMs, such as LiDAR and photogrammetry, require specific types of data that are often inaccessible in resource-constrained settings. To alleviate this problem, this study proposes an approach that generates DEMs from freely available RGB satellite imagery using generative deep learning, specifically a conditional Generative Adversarial Network (GAN). We first developed a global dataset of 12K RGB-DEM pairs from Landsat satellite imagery and NASA's SRTM digital elevation data, both from the year 2000. A custom preprocessing pipeline selects high-quality, cloud-free regions and aggregates normalized RGB composites from the Landsat imagery. The model was trained in two stages: first on the complete dataset, then fine-tuned on high-quality samples filtered by Structural Similarity Index Measure (SSIM) values to improve performance on challenging terrains. The results show promising performance in mountainous regions, with an overall mean root-mean-square error (RMSE) of 0.4671 and a mean SSIM score of 0.2065 (on a scale of -1 to 1), while exposing limitations in lowland and residential areas. This study underscores the importance of meticulous preprocessing and iterative refinement in generative modeling for DEM generation. It offers a cost-effective and adaptive alternative to conventional methods, while emphasizing the challenge of generalizing across diverse terrains worldwide.
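The two evaluation metrics and the SSIM-guided filtering step can be illustrated with a short sketch. This is not the authors' implementation. It assumes that DEM tiles are normalized to [-1, 1] (hence `data_range=2.0`), that the SSIM used for filtering is computed between a stage-one prediction and its reference DEM, and that a single global window is an acceptable stand-in for the usual locally windowed SSIM. The function names `rmse`, `global_ssim`, and `select_for_finetuning` are hypothetical.

```python
import numpy as np


def rmse(pred, target):
    """Root-mean-square error between two equally shaped arrays."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.sqrt(np.mean((pred - target) ** 2)))


def global_ssim(x, y, data_range=2.0, k1=0.01, k2=0.03):
    """SSIM computed over the whole image as one window.

    Simplification: standard SSIM averages this statistic over local
    sliding windows. data_range=2.0 assumes values in [-1, 1].
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * data_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2  # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)


def select_for_finetuning(preds, targets, threshold=0.2):
    """Return indices of samples whose SSIM meets the threshold.

    Assumption: preds are stage-one model outputs and targets are the
    reference DEMs; the kept indices form the fine-tuning subset.
    """
    return [
        i
        for i, (p, t) in enumerate(zip(preds, targets))
        if global_ssim(p, t) >= threshold
    ]


# Toy example: a ramp-shaped "DEM" and an inverted copy of it.
reference = np.linspace(-1.0, 1.0, 16).reshape(4, 4)
predictions = [reference, -reference]
references = [reference, reference]
kept = select_for_finetuning(predictions, references, threshold=0.2)
```

In the toy example the first prediction matches its reference and is kept, while the inverted one has a negative SSIM and is dropped. In practice a windowed implementation such as `skimage.metrics.structural_similarity` would likely be preferred over the single-window version shown here.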
