The role of spatial context and multitask learning in the detection of organic and conventional farming systems based on Sentinel-2 time series

Jan Hemmerling, Marcel Schwieder, Philippe Rufin, Leon-Friedrich Thomas, Mirela Tulbure, Patrick Hostert, Stefan Erasmi

Abstract

Organic farming is a key element in achieving more sustainable agriculture. For a better understanding of the development and impact of organic farming, comprehensive, spatially explicit information is needed. This study presents an approach for the discrimination of organic and conventional farming systems using intra-annual Sentinel-2 time series. In addition, it examines two factors influencing this discrimination: the joint learning of crop type information in a concurrent task and the role of spatial context. A Vision Transformer model based on the Temporo-Spatial Vision Transformer (TSViT) architecture was used to construct a classification model for the two farming systems. The model was extended for simultaneous learning of the crop type, creating a multitask learning setting. By varying the patch size presented to the model, we tested the influence of spatial context on the classification accuracy of both tasks. We show that discrimination between organic and conventional farming systems using multispectral remote sensing data is feasible. However, classification performance varies substantially across crop types. For several crops, such as winter rye, winter wheat, and winter oat, F1 scores of 0.8 or higher can be achieved. In contrast, other agricultural land use classes, such as permanent grassland, orchards, grapevines, and hops, cannot be reliably distinguished, with F1 scores for the organic management class of 0.4 or lower. Joint learning of farming system and crop type provides only limited additional benefits over single-task learning. In contrast, incorporating wider spatial context improves the performance of both farming system and crop type classification. Overall, we demonstrate that a classification of agricultural farming systems is possible in a diverse agricultural region using multispectral remote sensing data.

Paper Structure

This paper contains 11 sections, 5 figures, 8 tables.

Figures (5)

  • Figure 1: Location of our study area in Germany. Zoom-in (1): grid system with 30 × 30 km tiles; training, validation, and test tiles are indicated. Zoom-in (2): subdivision of tiles into 300 × 300 m patches (30 × 30 pixels); outlined are patches with more than 15% agricultural land, as used for training and validation. Patches are further subdivided into 100 × 100 m (10 × 10 pixels) and 20 × 20 m (2 × 2 pixels) sub-patches. Background: Google Earth.
  • Figure 2: Overview of the adapted Temporo-Spatial Vision Transformer (TSViT). Pixels are represented by grey grid lines; colored boxes represent sub-patches. (1) Input patches are arranged into n sub-patches (PP) of size S. For each of the 36 time steps (T), 10 bands (B) are embedded into tokens of length d. (2) In the temporal transformer stage, tokens interact across T and the concatenated class tokens. (3) In the spatial transformer stage, class tokens interact only with their corresponding spatial class token. Final predictions for crop type and management system are produced via separate argmax layers.
  • Figure 3: Mean F1 score on the validation data set after each training epoch.
  • Figure 4: Crop type and management system predictions from model runs with different spatial input dimensions (30 × 30, 10 × 10, and 2 × 2 pixels, as described in Section 2.3), with IACS data as reference and Random Forest model predictions as described in Section 2.8.
  • Figure 5: Crop type and management form predictions from model runs with different spatial input dimensions (30 × 30, 10 × 10, and 2 × 2 pixels, as described in Section 2.3), with IACS data as reference and Random Forest model predictions as described in Section 2.8.
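
The patch layout described in the Figure 1 caption can be illustrated with a minimal NumPy sketch. It assumes a binary agricultural-land mask at 10 m resolution whose side lengths are multiples of the patch size. The function names, the toy mask, and the use of non-overlapping blocks are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def split_into_patches(raster, patch):
    """Split a (H, W) array into non-overlapping (patch, patch) blocks.

    Hypothetical helper: H and W must be multiples of `patch`.
    Blocks are returned in row-major order.
    """
    h, w = raster.shape
    if h % patch or w % patch:
        raise ValueError("raster size must be a multiple of the patch size")
    return (
        raster.reshape(h // patch, patch, w // patch, patch)
        .swapaxes(1, 2)
        .reshape(-1, patch, patch)
    )


def select_agricultural_patches(ag_mask, patch=30, min_share=0.15):
    """Keep patches whose share of agricultural pixels exceeds `min_share`.

    The 30-pixel patch size and the 15% threshold follow the Figure 1
    caption; everything else here is an illustrative assumption.
    """
    patches = split_into_patches(ag_mask, patch)
    share = patches.mean(axis=(1, 2))  # fraction of agricultural pixels
    keep = share > min_share
    return patches[keep], keep


# Toy 60 x 60 pixel mask (four 30 x 30 patches), not real data.
mask = np.zeros((60, 60))
mask[:10, :20] = 1    # 200 of 900 pixels in the top-left patch
mask[:10, 30:40] = 1  # 100 of 900 pixels in the top-right patch

selected, keep = select_agricultural_patches(mask)

# Further subdivision of one selected patch into smaller sub-patches.
sub_10 = split_into_patches(selected[0], 10)  # 10 x 10 pixel sub-patches
sub_2 = split_into_patches(selected[0], 2)    # 2 x 2 pixel sub-patches
```

In this toy case only the top-left patch passes the 15% threshold, and it splits into nine 10 × 10 pixel or 225 2 × 2 pixel sub-patches.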