AgriPotential: A Novel Multi-Spectral and Multi-Temporal Remote Sensing Dataset for Agricultural Potentials

Mohammad El Sakka, Caroline De Pourtales, Lotfi Chaari, Josiane Mothe

TL;DR

AgriPotential addresses the lack of public resources for agricultural potential prediction by introducing a large, multi-spectral, multi-temporal Sentinel-2 dataset with pixel-level labels for three crop types across five potential classes. It combines ground-truth BD Sol-GDPA annotations with careful preprocessing (cloud-filtered monthly images, 5 m super-resolution, and 128×128 patches) to enable diverse tasks including ordinal regression, multi-label classification, and spatio-temporal modeling. The paper provides thorough data statistics, label co-occurrence analyses, and baseline experiments using a UNet to validate feasibility and highlight the benefits of ordinal labeling. This dataset is poised to advance data-driven sustainable land-use planning and can be extended with additional remote-sensing and environmental data for richer modeling.

Abstract

Remote sensing has emerged as a critical tool for large-scale Earth monitoring and land management. In this paper, we introduce AgriPotential, a novel benchmark dataset composed of Sentinel-2 satellite imagery captured over multiple months. The dataset provides pixel-level annotations of agricultural potentials for three major crop types (viticulture, market gardening, and field crops) across five ordinal classes. AgriPotential supports a broad range of machine learning tasks, including ordinal regression, multi-label classification, and spatio-temporal modeling. The data cover diverse areas in Southern France, offering rich spectral information. AgriPotential is the first public dataset designed specifically for agricultural potential prediction, aiming to improve data-driven approaches to sustainable land use planning. The dataset and the code are freely accessible at: https://zenodo.org/records/15551829

Paper Structure

This paper contains 18 sections, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Overview of the end-to-end construction pipeline of the AgriPotential dataset. The process begins with the selection of Sentinel-2 images from the GEODES CNES Portal, filtered to retain only those with minimal cloud cover. Ground truth annotations are sourced from the BD SOL - GDPA database and filtered to include only the highest confidence levels. These annotations are provided in vector format, with polygons representing spatial regions labeled by agricultural potential. The ground truth is then reprojected into the coordinate reference system (CRS) of the Sentinel-2 images to ensure accurate spatial alignment. The two data sources are then stacked and overlaid with a grid of tiles, and the tiles are divided into training, validation, and testing subsets. Finally, patches are extracted from the tiles to create the final dataset.
  • Figure 2: The Hérault department in the South of France.
  • Figure 3: Agricultural potential classes ranging from "Very low" to "Very high", represented over a selected patch in the Hérault region. The colored map shows potential levels for viticulture.
  • Figure 4: The hierarchical file structure of the AgriPotential Dataset.
  • Figure 5: Class distribution across the three crop types. The class frequencies reveal imbalanced representations within each crop type.
  • ...and 2 more figures
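
The last two steps of the pipeline in the Figure 1 caption (dividing tiles into training, validation, and testing subsets, then extracting patches) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the split fractions, the random tile assignment, and the band count in the usage note are all assumptions. Only the 128×128 patch size comes from the TL;DR above.

```python
import numpy as np

PATCH = 128  # patch side in pixels, as stated in the TL;DR


def split_tiles(n_tiles, fractions=(0.7, 0.15, 0.15), seed=0):
    """Assign whole tiles to train/val/test subsets.

    The fractions and the random assignment are assumptions made for
    illustration; the source does not state how tiles were divided.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_tiles)
    n_train = round(fractions[0] * n_tiles)
    n_val = round(fractions[1] * n_tiles)
    return {
        "train": order[:n_train],
        "val": order[n_train:n_train + n_val],
        "test": order[n_train + n_val:],
    }


def extract_patches(image, labels, patch=PATCH):
    """Cut one stacked tile into non-overlapping patch x patch windows.

    image:  array of shape (bands, H, W)
    labels: array of shape (H, W), aligned with the image
    Partial windows at the right and bottom edges are dropped.
    """
    _, height, width = image.shape
    patches = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            rows = slice(top, top + patch)
            cols = slice(left, left + patch)
            patches.append((image[:, rows, cols], labels[rows, cols]))
    return patches
```

Splitting at the tile level before cutting patches keeps every patch from a given tile in the same subset, which avoids leakage between neighbouring patches. As a usage example, `extract_patches(np.zeros((4, 300, 260)), np.zeros((300, 260)))` yields four image/label pairs, since only two full 128-pixel windows fit along each axis (the four bands here are arbitrary).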