SERA-H: Beyond Native Sentinel Spatial Limits for High-Resolution Canopy Height Mapping
Thomas Boudras, Martin Schwartz, Rasmus Fensholt, Martin Brandt, Ibrahim Fayad, Jean-Pierre Wigneron, Gabriel Belouze, Fajwel Fogel, Philippe Ciais
TL;DR
SERA-H is an end-to-end framework that goes beyond the native 10 m resolution of Sentinel imagery by coupling a trainable super-resolution module (EDSR) with a temporal attention regression network (UTAE) to predict canopy height maps at 2.5 m. Trained with dense ALS supervision from the Open-Canopy dataset, it uses Sentinel-1/2 time series to reconstruct fine forest structure, achieving an MAE of about 2.6 m and a high Tree Cover IoU. Ablation studies show that both the learnable upsampling and the temporal modeling are critical, and benchmarking shows performance competitive with, and in some cases matching, methods that use higher-resolution or commercial imagery. The method enables freely accessible, high-frequency forest mapping, although it remains limited in resolving very fine structures and in transferring to data-scarce biomes. Overall, SERA-H offers a practical path to accurate, high-resolution canopy height mapping using open data and end-to-end learning.
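To make the resolution arithmetic concrete, the sketch below shows a sub-pixel rearrangement ("pixel shuffle"), the upsampling operation used in the original EDSR design, written in plain NumPy. Going from 10 m to 2.5 m corresponds to an upscaling factor of 4 along each axis. Whether SERA-H uses exactly this layer, and in what position relative to the temporal encoder, is an assumption here; the function name `pixel_shuffle` and the array shapes are illustrative only.

```python
import numpy as np


def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange (C*r*r, H, W) feature maps into (C, H*r, W*r).

    Each group of r*r channels is spread over an r x r block of output
    pixels, which is how a learned convolution can emit a finer grid.
    """
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    # Split channels into (C, r, r), then interleave the two r axes with H and W.
    x = x.reshape(c_out, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # (C, H, r, W, r)
    return x.reshape(c_out, h * r, w * r)


# Illustrative numbers: a 2 x 2 patch at 10 m becomes an 8 x 8 patch at 2.5 m.
input_resolution_m = 10.0
scale = 4
features = np.arange(16 * 2 * 2, dtype=np.float32).reshape(16, 2, 2)
upsampled = pixel_shuffle(features, scale)
output_resolution_m = input_resolution_m / scale
```

In a trained network, the 16 input channels would come from a convolution whose weights are learned, so the rearrangement itself adds no parameters; it only decides where each predicted value lands on the finer grid.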
Abstract
High-resolution mapping of canopy height is essential for forest management and biodiversity monitoring. Although recent deep learning methods predict height maps from satellite imagery, they often face a trade-off between data accessibility and spatial resolution. To overcome this trade-off, we present SERA-H, an end-to-end model combining a super-resolution module (EDSR) and temporal attention encoding (UTAE). Trained under the supervision of high-density airborne LiDAR data (ALS), our model generates 2.5 m resolution height maps from freely available Sentinel-1 and Sentinel-2 (10 m) time series. Evaluated on an open-source benchmark dataset in France, SERA-H reaches an MAE of 2.6 m and a coefficient of determination of 0.82. It not only outperforms standard Sentinel-1/2 baselines but also performs comparably to, or better than, methods relying on commercial very high-resolution imagery (SPOT-6/7, PlanetScope, Maxar). These results demonstrate that combining high-resolution supervision with the spatiotemporal information embedded in time series enables the reconstruction of details beyond the input sensors' native resolution. SERA-H opens the possibility of mapping forests freely and with high revisit frequency, at an accuracy comparable to that of costly commercial imagery. The source code is available at https://github.com/ThomasBoudras/SERA-H
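For readers unfamiliar with the reported metrics, the following minimal sketch computes MAE, the coefficient of determination, and a tree cover IoU between a predicted and a reference height map. The standard definitions of MAE and R² are used; the 2 m height threshold for the tree cover mask and all function names are assumptions made for illustration, not values or code taken from the paper or its repository.

```python
import numpy as np


def mae(pred: np.ndarray, ref: np.ndarray) -> float:
    """Mean absolute error, in the same unit as the height maps (metres)."""
    return float(np.mean(np.abs(pred - ref)))


def r2_score(pred: np.ndarray, ref: np.ndarray) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((ref - pred) ** 2)
    ss_tot = np.sum((ref - np.mean(ref)) ** 2)
    return float(1.0 - ss_res / ss_tot)


def tree_cover_iou(pred: np.ndarray, ref: np.ndarray, threshold_m: float = 2.0) -> float:
    """IoU of binary tree masks; threshold_m is a hypothetical cutoff."""
    pred_mask = pred >= threshold_m
    ref_mask = ref >= threshold_m
    union = np.logical_or(pred_mask, ref_mask).sum()
    if union == 0:
        return 1.0  # both maps contain no trees, so the masks agree
    return float(np.logical_and(pred_mask, ref_mask).sum() / union)


# Toy 2 x 2 height maps in metres, with a constant +2 m prediction bias.
reference = np.array([[0.0, 10.0], [20.0, 30.0]])
prediction = reference + 2.0
```

On this toy input the MAE equals the 2 m bias, while the IoU drops below 1 because the bare-ground pixel is pushed up to the assumed threshold and counted as tree cover.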
