Deep Learning for Spatio-Temporal Fusion in Land Surface Temperature Estimation: A Comprehensive Survey, Experimental Analysis, and Future Trends
Sofiane Bouaziz, Adel Hafiane, Raphael Canals, Rachid Nedjai
TL;DR
This survey formalizes the spatio-temporal fusion (STF) task for Land Surface Temperature (LST) and reviews deep learning (DL) approaches spanning CNNs, autoencoders, GANs, and transformers, highlighting how LST’s rapid temporal dynamics and sharp spatial gradients challenge STF methods designed for surface reflectance (SR). It introduces STF-LST, a public MODIS-Landsat LST dataset (51 pairs, 2013–2024), to benchmark methods and expose gaps in current DL architectures, including generalization, cloud-gap handling, and physical consistency. The study finds that DL methods originally designed for SR struggle to generalize to LST, with average RMSE often exceeding 3°C and notable artifacts or oversmoothing, underscoring the need for joint spatio-temporal models, robust gap handling, and physics-informed losses. The work outlines future directions, including unified spatio-temporal networks, pretrained fusion models, higher-resolution guidance, and potential LLM-assisted semantic augmentation, to advance practically reliable LST fusion for climate and urban applications.
Abstract
Land Surface Temperature (LST) plays a key role in climate monitoring, urban heat assessment, and land-atmosphere interactions. However, current thermal infrared satellite sensors cannot simultaneously achieve high spatial and temporal resolution. Spatio-temporal fusion (STF) techniques address this limitation by combining complementary satellite data, one with high spatial but low temporal resolution, and another with high temporal but low spatial resolution. Existing STF techniques, from classical models to modern deep learning (DL) architectures, were primarily developed for surface reflectance (SR). Their application to thermal data remains limited and often overlooks LST-specific spatial and temporal variability. This study provides a focused review of DL-based STF methods for LST. We present a formal mathematical definition of the thermal fusion task, propose a refined taxonomy of relevant DL methods, and analyze the modifications required when adapting SR-oriented models to LST. To support reproducibility and benchmarking, we introduce a new dataset comprising 51 Terra MODIS-Landsat LST pairs from 2013 to 2024, and evaluate representative models to explore their behavior on thermal data. The analysis highlights performance gaps, architecture sensitivities, and open research challenges. The dataset and accompanying resources are publicly available at https://github.com/Sofianebouaziz1/STF-LST.
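As a concrete illustration of how a fused LST prediction might be scored against a reference Landsat LST image, the sketch below computes RMSE over valid pixels only. It is a minimal example under stated assumptions, not the evaluation code of this study: the function name `masked_rmse`, the use of NaN to mark cloud-contaminated pixels, and the optional `valid_mask` argument are all hypothetical choices made for illustration.

```python
import numpy as np


def masked_rmse(pred, target, valid_mask=None):
    """RMSE between two LST arrays, ignoring invalid pixels.

    Assumptions (illustrative, not taken from the paper):
    - pred and target are arrays of the same shape in the same unit
      (a difference in kelvin equals a difference in degrees Celsius);
    - cloud gaps or missing data are encoded as NaN, and/or excluded
      through an optional boolean valid_mask (True = usable pixel).
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    # Keep only pixels that are finite in both images.
    valid = np.isfinite(pred) & np.isfinite(target)
    if valid_mask is not None:
        valid &= np.asarray(valid_mask, dtype=bool)

    # No usable pixel: the error is undefined.
    if not valid.any():
        return float("nan")

    diff = pred[valid] - target[valid]
    return float(np.sqrt(np.mean(diff ** 2)))


# Toy 2x2 example: one reference pixel is missing (NaN) and is skipped.
reference = np.array([[300.0, 302.0], [np.nan, 305.0]])
prediction = np.array([[303.0, 302.0], [301.0, 301.0]])
score = masked_rmse(prediction, reference)
```

In the toy example the three valid pixels have errors of 3, 0, and -4, so the score is the square root of 25/3. Restricting the metric to valid pixels keeps cloud gaps from distorting the comparison, which matters for thermal data where cloud contamination is common.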
