Mitigating Bad Ground Truth in Supervised Machine Learning based Crop Classification: A Multi-Level Framework with Sentinel-2 Images

Sanayya A, Amoolya Shetty, Abhijeet Sharma, Venkatesh Ravichandran, Masthan Wali Gosuvarapalli, Sarthak Jain, Priyamvada Nanjundiah, Ujjal Kr Dutta, Divya Sharma

TL;DR

The paper addresses inaccurate ground-truth (GT) data in supervised crop classification, a problem amplified by manual GT collection in India. It introduces a multi-level GT cleaning framework that leverages multi-temporal Sentinel-2 data, NDVI temporal profiles, spectral embeddings, clustering, False Colour Composite (FCC) verification, and distance-based cross-district validation. A Random Forest classifier trained on cleaned GT achieves markedly higher F1 scores (mustard ~0.92, paddy ~0.97, wheat ~0.94) than one trained on unclean GT (approx. 0.26–0.53), illustrating the impact of GT quality. The approach enables more reliable crop classification and has practical implications for agricultural decision-making and policy, including loan underwriting.

Abstract

In agricultural management, precise Ground Truth (GT) data is crucial for accurate Machine Learning (ML) based crop classification. Yet issues such as crop mislabeling and incorrect land identification are common. We propose a multi-level GT cleaning framework that uses multi-temporal Sentinel-2 data to address these issues. Specifically, the framework generates embeddings for farmland, clusters similar crop profiles, and identifies outliers that indicate GT errors. We validated clusters with False Colour Composite (FCC) checks and used distance-based metrics to scale and automate this verification process. The importance of cleaning the GT data became apparent when models were trained on the clean and the unclean data. For instance, a Random Forest model trained on the clean GT data achieved an F1 score up to 70 percentage points higher than the same model trained on the unclean GT data. This approach advances crop classification methodologies, with potential applications in improving loan underwriting and agricultural decision-making.
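
The outlier-identification idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes each plot is summarised by an NDVI time series (computed from Sentinel-2 NIR band B8 and red band B4), and it swaps the paper's embeddings and clustering for a simpler stand-in, namely the distance of each profile from the median profile of its labelled crop. The function names, the robust z-score rule, and the `z_thresh` value are all hypothetical choices made for illustration.

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """NDVI = (NIR - Red) / (NIR + Red), element-wise.
    For Sentinel-2, NIR is band B8 and red is band B4."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)  # eps avoids division by zero

def flag_outliers(profiles, labels, z_thresh=3.0):
    """Flag plots whose NDVI profile lies unusually far from the median
    profile of their labelled crop (hypothetical rule, not from the paper).

    profiles: array of shape (n_plots, n_dates)
    labels:   array of shape (n_plots,) with the GT crop label per plot
    Returns a boolean array; True marks a suspected GT error.
    """
    profiles = np.asarray(profiles, dtype=float)
    labels = np.asarray(labels)
    suspect = np.zeros(len(labels), dtype=bool)
    for crop in np.unique(labels):
        idx = np.where(labels == crop)[0]
        median_profile = np.median(profiles[idx], axis=0)
        # Euclidean distance of each plot's profile from the crop median
        dist = np.linalg.norm(profiles[idx] - median_profile, axis=1)
        # Robust z-score based on the median absolute deviation (MAD)
        med = np.median(dist)
        mad = np.median(np.abs(dist - med)) + 1e-9
        suspect[idx] = (dist - med) / (1.4826 * mad) > z_thresh
    return suspect
```

Plots flagged this way would still need a visual check (the abstract mentions FCC checks for this purpose); the distance rule only narrows down which plots to inspect.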

Paper Structure

This paper contains 7 sections, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Level 1 (L1): Overall processing and mislabeling, focusing on plots mapped to non-agricultural use cases, excessive overlap of multiple ground truth (GT) polygons, and overlaps with roads or built structures
  • Figure 2: Level 2 (L2): Cleaning GT data using vegetation index (VI) temporal profiles. The sheer intertwining of profiles seen here is untangled by isolating those profiles that fall below a prescribed VI value.
  • Figure 3: Level 3 (L3): Clustering the profiles highlights those with low variance as well as noisy ones.
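
The Level 2 step described in the Figure 2 caption, isolating profiles that fall below a prescribed VI value, might look roughly like the sketch below. The caption does not say how the threshold is applied, so this assumes a profile is isolated when its seasonal peak never reaches the threshold; the function name and the `min_peak` value of 0.4 are hypothetical.

```python
import numpy as np

def keep_mask_by_peak_vi(profiles, min_peak=0.4):
    """Return True for plots whose VI profile reaches at least `min_peak`
    at some date, and False for plots that stay below it all season.

    profiles: array of shape (n_plots, n_dates); NaN marks a missing
              (for example, cloud-masked) observation and is ignored.
    min_peak: assumed threshold, not a value taken from the paper.
    """
    profiles = np.asarray(profiles, dtype=float)
    peak = np.nanmax(profiles, axis=1)  # seasonal peak per plot
    return peak >= min_peak

# Example: the first plot shows a clear crop-like peak, the second stays flat.
example = np.array([
    [0.20, 0.50, 0.80, 0.30],
    [0.10, 0.15, 0.20, 0.10],
])
mask = keep_mask_by_peak_vi(example)
```

Plots with a `False` mask entry are candidates for removal or manual review, since a profile that never rises suggests fallow or non-agricultural land rather than the labelled crop.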