Assessment of Sentinel-2 spatial and temporal coverage based on the scene classification layer

Cristhian Sanchez, Francisco Mena, Marcela Charfuelan, Marlon Nuske, Andreas Dengel

TL;DR

This work introduces a Sentinel-2 SCL-based framework to quantify spatio-temporal clean optical coverage via SC, TC, SCA, and TCA metrics for sample regions. By evaluating against ML tasks in AI4EO Enhanced Agriculture and LandCoverNet, the authors demonstrate that regions with higher clean coverage tend to yield better predictive performance, while low-coverage regions exhibit degraded accuracy. The approach offers a scalable, label-driven means to assess data quality across time series and geographies, with potential applications in data curation and curriculum learning for remote sensing models. Practically, this metric enables researchers to understand and compare region-specific data quality and to prioritize data acquisition or preprocessing accordingly.

Abstract

Since the launch of the Sentinel-2 (S2) satellites, many ML models have used the data for diverse applications. The scene classification layer (SCL) inside the S2 product provides rich information for training, for example to filter out images with high cloud coverage. However, the SCL has more potential than this. We propose a technique to assess the clean optical coverage of a region, expressed as a satellite image time series (SITS) and calculated from the S2-based SCL data. With a manual threshold and specific labels in the SCL, the proposed technique assigns a percentage of spatial and temporal coverage across the time series and a high/low assessment. By evaluating the AI4EO challenge for Enhanced Agriculture, we show that the assessment is correlated with the predictive results of ML models: the classification results in a region with low spatial and temporal coverage are worse than in a region with high coverage. Finally, we applied the technique across all continents of the global dataset LandCoverNet.
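The thresholding idea in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the four equations are not reproduced on this page, so the coverage definitions below are assumptions made only for illustration. Spatial coverage is taken as the mean share of clean pixels per image, and temporal coverage as the share of time steps with at least one clean pixel. The membership of the label set (named after the L-veg-non-veg criterion from the figure captions) is guessed from its name, and the helper names `clean_fraction`, `coverage`, and `assess` are hypothetical. The 70% threshold is the one shown in Figure 1.

```python
from typing import Iterable, Sequence, Tuple

# Sen2Cor SCL class codes used here: 4 = vegetation, 5 = not vegetated,
# 8 / 9 = cloud medium / high probability.
# ASSUMPTION: the members of this label set are guessed from its name.
L_VEG_NON_VEG = frozenset({4, 5})

Image = Sequence[Sequence[int]]  # one SCL raster: rows of class codes


def clean_fraction(image: Image, labels: Iterable[int]) -> float:
    """Fraction of pixels in one SCL raster whose class is in `labels`."""
    wanted = set(labels)
    pixels = [code for row in image for code in row]
    return sum(code in wanted for code in pixels) / len(pixels)


def coverage(sits: Sequence[Image], labels: Iterable[int]) -> Tuple[float, float]:
    """Return (spatial, temporal) coverage in percent for a time series.

    ASSUMED definitions, for illustration only:
      spatial  = mean over time steps of the per-image clean fraction
      temporal = share of time steps with at least one clean pixel
    """
    wanted = set(labels)
    per_image = [clean_fraction(img, wanted) for img in sits]
    spatial = 100.0 * sum(per_image) / len(per_image)
    temporal = 100.0 * sum(f > 0 for f in per_image) / len(per_image)
    return spatial, temporal


def assess(percentage: float, threshold: float = 70.0) -> str:
    """Binary high/low assessment against a manually chosen threshold."""
    return "high" if percentage >= threshold else "low"


# Toy 4-step series of 2x2 SCL rasters (synthetic values, not real data).
sits = [
    [[4, 4], [4, 4]],  # fully vegetated: all pixels clean
    [[9, 9], [9, 9]],  # fully cloudy: no clean pixel
    [[4, 9], [5, 8]],  # half clean
    [[9, 9], [9, 9]],  # fully cloudy
]
sc, tc = coverage(sits, L_VEG_NON_VEG)
print(sc, assess(sc), tc, assess(tc))  # 37.5 low 50.0 low
```

Passing a different label set or threshold (for example the 50% used for LandCoverNet in Figure 4) changes only the arguments, which is what makes such an assessment cheap to run over many sample regions.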

Paper Structure

This paper contains 7 sections, 4 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Spatial and temporal coverage in the AI4EO Enhanced Agriculture dataset. Two types of filters were used: L-all-but-cloud as a cloud-removal filter and L-veg-non-veg as a vegetation-related filter. The 70% coverage threshold is shown in red.
  • Figure 2: Classification results in sample regions categorized as high or low coverage with a 70% threshold. Each point represents the averaged metric on a specific sample region. The L-veg-non-veg criterion is used for the assessment.
  • Figure 3: Accuracy (ACC) in different sample regions based on L-all-but-cloud. Each point represents the averaged metric on a specific sample region. The correlation of ACC with spatial coverage is $45.0$ and with temporal coverage is $70.3$.
  • Figure 4: Spatial and temporal coverage in the LandCoverNet dataset with the L-all-but-cloud criterion. The 50% coverage threshold is shown in red.