Assessment of Sentinel-2 spatial and temporal coverage based on the scene classification layer
Cristhian Sanchez, Francisco Mena, Marcela Charfuelan, Marlon Nuske, Andreas Dengel
TL;DR
This work introduces a framework based on the Sentinel-2 scene classification layer (SCL) that quantifies the spatio-temporal clean optical coverage of sample regions via spatial coverage (SC) and temporal coverage (TC) metrics and the associated SCA and TCA assessments. By evaluating on the AI4EO Enhanced Agriculture challenge and applying the technique to LandCoverNet, the authors show that regions with higher clean coverage tend to yield better predictive performance, while low-coverage regions exhibit degraded accuracy. The approach offers a scalable, label-driven means to assess data quality across time series and geographies, with potential applications in data curation and curriculum learning for remote sensing models. In practice, these metrics let researchers understand and compare region-specific data quality and prioritize data acquisition or preprocessing accordingly.
Abstract
Since the launch of the Sentinel-2 (S2) satellites, many machine learning (ML) models have used their data for diverse applications. The scene classification layer (SCL) included in the S2 product provides information that is useful for training, for example to filter out images with high cloud coverage. However, the SCL has more potential than this filtering use. We propose a technique to assess the clean optical coverage of a region, represented by a satellite image time series (SITS) and calculated from the S2 SCL data. Using a manually set threshold and specific SCL labels, the proposed technique assigns a percentage of spatial and temporal coverage across the time series, together with a high/low assessment. By evaluating on the AI4EO challenge for Enhanced Agriculture, we show that the assessment is correlated with the predictive results of ML models: the classification results in a region with low spatial and temporal coverage are worse than in a region with high coverage. Finally, we apply the technique across all continents of the global dataset LandCoverNet.
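
To make the idea concrete, the following is a minimal Python sketch of how coverage percentages and a high/low assessment could be derived from an SCL time series. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact formulas, so the set of "clean" SCL labels (4 vegetation, 5 not vegetated, 6 water), the two thresholds, the function name coverage_assessment, and the specific definitions of spatial and temporal coverage used here are all hypothetical choices.

```python
import numpy as np

# Assumption: SCL classes counted as "clean" (4 vegetation, 5 not vegetated, 6 water).
# The paper's actual label selection may differ.
CLEAN_LABELS = (4, 5, 6)


def coverage_assessment(scl, clean_labels=CLEAN_LABELS,
                        clean_threshold=0.5, assessment_threshold=60.0):
    """Illustrative coverage metrics for an SCL time series of shape (T, H, W).

    clean_threshold: minimum clean fraction for a pixel or an image to count
        as covered (hypothetical parameter).
    assessment_threshold: percentage at or above which coverage is labelled
        "high" (hypothetical parameter).
    """
    scl = np.asarray(scl)
    if scl.ndim != 3:
        raise ValueError("expected an array of shape (T, H, W)")

    clean = np.isin(scl, clean_labels)       # boolean mask, shape (T, H, W)
    per_image = clean.mean(axis=(1, 2))      # clean fraction of each time step
    per_pixel = clean.mean(axis=0)           # clean fraction of each pixel over time

    # Assumed definitions: share of pixels (spatial) and of time steps (temporal)
    # whose clean fraction reaches clean_threshold, expressed as percentages.
    spatial_pct = 100.0 * float((per_pixel >= clean_threshold).mean())
    temporal_pct = 100.0 * float((per_image >= clean_threshold).mean())

    def label(pct):
        return "high" if pct >= assessment_threshold else "low"

    return {
        "spatial_coverage_pct": spatial_pct,
        "temporal_coverage_pct": temporal_pct,
        "spatial_assessment": label(spatial_pct),
        "temporal_assessment": label(temporal_pct),
    }
```

Calling coverage_assessment on an integer array of SCL labels with shape (time, height, width) returns the two percentages and their high/low labels. Keeping the clean-label set and both thresholds as parameters reflects the manual choices mentioned in the abstract, so they can be tuned per region or task.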
