A Deep Learning Architecture for Land Cover Mapping Using Spatio-Temporal Sentinel-1 Features

Luigi Russo, Antonietta Sorriso, Silvia Liberata Ullo, Paolo Gamba

TL;DR

This work tackles land cover (LC) mapping in environments where optical data are hindered by clouds by proposing a SAR-only pipeline that uses Sentinel-1 VH data organized into four seasonal composites. A transformer-based Swin-Unet architecture processes 28 spatial features derived from these seasonal images to classify LC at 10 m resolution, demonstrated across Africa, Amazonia, and Siberia. The key innovations are the seasonal synthesized spatio-temporal features, all-SAR feature extraction, and global high-resolution (HR) LC mapping with strong cross-ecoregion generalization, achieving overall accuracy (OA) values up to 0.97 and outperforming CNN-based baselines. The method supports climate-related LC monitoring within the ESA CCI+ HR LC framework and offers a scalable, cloud-resilient tool for all-weather land cover mapping across diverse ecosystems.

Abstract

Land Cover (LC) mapping using satellite imagery is critical for environmental monitoring and management. Deep Learning (DL), particularly Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), has revolutionized this field by enhancing the accuracy of classification tasks. In this work, a novel approach combining a transformer-based Swin-Unet architecture with seasonal synthesized spatio-temporal images has been employed to classify LC types using spatio-temporal features extracted from Sentinel-1 (S1) Synthetic Aperture Radar (SAR) data, organized into seasonal clusters. The study focuses on three distinct regions (Amazonia, Africa, and Siberia) and evaluates the model performance across diverse ecoregions within these areas. By utilizing seasonal feature sequences instead of dense temporal sequences, notable performance improvements have been achieved, especially in regions with temporal data gaps like Siberia, where S1 acquisitions are unevenly distributed. The results demonstrate the effectiveness and the generalization capabilities of the proposed methodology in achieving high overall accuracy (O.A.) values, even in regions with limited training data.
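
The idea of replacing a dense temporal sequence with seasonal clusters can be illustrated with a minimal sketch. It is not the authors' implementation: the month-to-season mapping (meteorological seasons), the use of a per-pixel mean as the compositing rule, and the names `SEASONS` and `seasonal_composites` are all assumptions made for illustration. The point it shows is that an unevenly sampled time series of any length collapses to a fixed set of four images, so gaps in acquisition dates do not change the input shape.

```python
import numpy as np

# Assumed month-to-season mapping (meteorological seasons); the paper's
# actual seasonal grouping is not specified in this section.
SEASONS = {
    "winter": (12, 1, 2),
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
}


def seasonal_composites(stack, months):
    """Collapse a (T, H, W) time series into four seasonal mean images.

    stack  : array of shape (T, H, W), one image per acquisition date
    months : length-T sequence of acquisition months (1..12)
    Returns an array of shape (4, H, W); a season with no acquisitions
    is left as NaN so the gap stays visible downstream.
    """
    stack = np.asarray(stack, dtype=float)
    months = np.asarray(months)
    out = np.full((len(SEASONS),) + stack.shape[1:], np.nan)
    for i, season_months in enumerate(SEASONS.values()):
        mask = np.isin(months, season_months)
        if mask.any():
            # Per-pixel temporal mean over the acquisitions in this season
            out[i] = stack[mask].mean(axis=0)
    return out
```

With six acquisitions spread unevenly over the year, for example two in January, one in April, two in July, and one in October, the function still returns exactly four images, one per season.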

Paper Structure

This paper contains 20 sections, 1 equation, 19 figures, 6 tables.

Figures (19)

  • Figure 1: Overview of the employed Swin-UNet architecture described in Section \ref{sec:swin_unet} for the LC classification task. Distinctive aspects of the Swin Transformer architecture are combined with the UNet architecture to achieve optimal performance.
  • Figure 2: A simplified workflow diagram of the proposed mapping procedure applied to SAR temporal sequences. The pre-processing part was done using a SNAP graph, as explained in Section \ref{subsec:data_preprocessing}. The multitemporal speckle noise reducer, feature extraction, training and validation set generation are also described in Section \ref{sec:methods}, while the DL-based block and the results are discussed in Section \ref{sec:results}.
  • Figure 3: Block diagram of S1 data pre-processing.
  • Figure 4: Multitemporal despeckle flowchart applied to the S1 temporal sequence. Temporal averaging of the SAR time series produces the super image $\hat{u}_m$. For each pixel of the S1 images, the ratio image $\tau_t$ is formed from the image $\upsilon_t$ at time $t$ and the super image $\hat{u}_m$. Although speckle in the super image is greatly reduced, it is still present, so the Lee filter is applied to $\tau_t$, resulting in the image $\hat{\rho}_m$. In the last step, the restored image $\hat{u}_t$ is obtained by multiplying the denoised ratio image by the super image.
  • Figure 5: Enlargements of the test areas: (a) Amazonia (62.1014$^\circ$ W, 23.5983$^\circ$ S : 42.9441$^\circ$ W, 0$^\circ$ N, WGS 84), (b) Africa (9.8986$^\circ$ E, 0.0885$^\circ$ S : 43.2908$^\circ$ E, 18.0891$^\circ$ N, WGS 84) and (c) Siberia (64.4361$^\circ$ E, 51.2789$^\circ$ N : 93.4017$^\circ$ E, 75.6847$^\circ$ N, WGS 84).
  • ...and 14 more figures
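
The multitemporal despeckle steps in the Figure 4 caption (temporal average, ratio image, Lee filter, multiplication by the super image) can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' code: it assumes intensity images in a `(T, H, W)` array, a basic Lee filter with a square window, and hypothetical parameters `size` and `looks` (the window size and the equivalent number of looks used to set the speckle variance), none of which are given in this section.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def box_mean(img, size):
    """Local mean over a size x size window, with reflected borders."""
    pad = size // 2
    padded = np.pad(img, pad, mode="reflect")
    windows = sliding_window_view(padded, (size, size))
    return windows.mean(axis=(-2, -1))


def lee_filter(img, size=5, looks=1.0, eps=1e-12):
    """Basic Lee filter: blend each pixel with its local mean.

    `looks` is an assumed equivalent number of looks; 1 / looks is used
    as the squared coefficient of variation of the speckle.
    """
    local_mean = box_mean(img, size)
    local_var = np.maximum(box_mean(img * img, size) - local_mean**2, 0.0)
    cu2 = 1.0 / looks
    ci2 = local_var / np.maximum(local_mean**2, eps)
    # Weight near 0 in homogeneous areas (output = local mean),
    # near 1 where local variation exceeds what speckle explains.
    weight = np.clip(1.0 - cu2 / np.maximum(ci2, eps), 0.0, 1.0)
    return local_mean + weight * (img - local_mean)


def multitemporal_despeckle(stack, size=5, looks=1.0, eps=1e-12):
    """Ratio-based despeckling of a (T, H, W) SAR intensity stack."""
    stack = np.asarray(stack, dtype=float)
    # Step 1: super image = temporal average of the series
    super_img = stack.mean(axis=0)
    # Step 2: ratio image for each date
    ratio = stack / np.maximum(super_img, eps)
    # Step 3: Lee filter applied to each ratio image
    ratio_filtered = np.stack([lee_filter(r, size, looks) for r in ratio])
    # Step 4: restored image = denoised ratio * super image
    return ratio_filtered * super_img
```

The design rationale is that the super image carries the stable spatial detail with much less speckle, while each ratio image isolates what changed at date `t`; filtering only the ratio keeps the spatial detail out of the smoothing step.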