A Deep Learning Architecture for Land Cover Mapping Using Spatio-Temporal Sentinel-1 Features
Luigi Russo, Antonietta Sorriso, Silvia Liberata Ullo, Paolo Gamba
TL;DR
This work tackles land cover (LC) mapping in regions where cloud cover hinders optical data by proposing a SAR-only pipeline that uses Sentinel-1 VH data organized into four seasonal composites. A transformer-based Swin-Unet architecture processes 28 spatial features derived from these seasonal images to classify LC at 10 m resolution, demonstrated across Africa, Amazonia, and Siberia. The key innovations are the seasonal synthesized spatio-temporal features, all-SAR feature extraction, and global high-resolution (HR) LC mapping with strong cross-ecoregion generalization, achieving overall accuracy (OA) values up to 0.97 and outperforming CNN-based baselines. The method supports climate-related LC monitoring within the ESA CCI+ HR LC framework and offers a scalable, all-weather mapping tool that remains robust across diverse ecosystems.
Abstract
Land Cover (LC) mapping using satellite imagery is critical for environmental monitoring and management. Deep Learning (DL), particularly through Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), has revolutionized this field by improving classification accuracy. In this work, a novel approach combines a transformer-based Swin-Unet architecture with seasonal synthesized spatio-temporal images to classify LC types, using spatio-temporal features extracted from Sentinel-1 (S1) Synthetic Aperture Radar (SAR) data and organized into seasonal clusters. The study focuses on three distinct regions (Amazonia, Africa, and Siberia) and evaluates model performance across diverse ecoregions within these areas. Using seasonal feature sequences instead of dense temporal sequences yields notable performance improvements, especially in regions with temporal data gaps such as Siberia, where the distribution of S1 acquisitions is uneven. The results demonstrate the effectiveness and generalization capability of the proposed methodology, which achieves high overall accuracy (OA) values even in regions with limited training data.
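To make the seasonal idea concrete, the following is a minimal sketch of how an irregular S1 VH time series could be collapsed into four seasonal composites. It is an illustration under stated assumptions, not the pipeline used in this work: the function name `seasonal_composites`, the month-to-season mapping, and the choice of a per-pixel temporal mean are all hypothetical, and the abstract does not specify how the actual seasonal features are computed.

```python
import numpy as np

# Assumed season definition (meteorological seasons); the paper's actual
# seasonal boundaries are not given in the abstract.
SEASONS = {
    "winter": (12, 1, 2),
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
}


def seasonal_composites(vh_stack, months):
    """Collapse a VH time series into one composite per season.

    vh_stack: array of shape (T, H, W), NaN where a pixel has no valid data.
    months:   length-T sequence with the acquisition month (1..12) of each image.
    Returns an array of shape (4, H, W) holding per-pixel seasonal means;
    a pixel with no valid observation in a season is left as NaN.
    """
    vh_stack = np.asarray(vh_stack, dtype=float)
    months = np.asarray(months)
    composites = np.empty((len(SEASONS),) + vh_stack.shape[1:])

    for i, season_months in enumerate(SEASONS.values()):
        # Select the acquisitions that fall in this season (possibly none).
        season = vh_stack[np.isin(months, season_months)]
        valid = ~np.isnan(season)
        count = valid.sum(axis=0)
        total = np.where(valid, season, 0.0).sum(axis=0)
        # Mean over valid observations only; NaN where count is zero.
        composites[i] = np.divide(
            total, count, out=np.full(count.shape, np.nan), where=count > 0
        )
    return composites
```

Because each season is averaged over whatever acquisitions happen to exist, the output always has the same shape regardless of how many images were acquired or how unevenly they are spaced. This is one plausible reason a seasonal representation copes better with temporal gaps than a dense sequence of fixed length.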
