
Sampling Strategies based on Wisdom of Crowds for Amazon Deforestation Detection

Hugo Resende, Eduardo B. Neto, Fabio A. M. Cappabianco, Alvaro L. Fazenda, Fabio A. Faria

TL;DR

This work tackles deforestation detection in the Brazilian Amazon by integrating ForestEyes crowd-labeled data with Haralick texture features from Sentinel-2 and an SVM classifier. It introduces entropy-based sampling strategies, particularly an increasing (low-entropy-first) approach, to select training samples from crowd labels, achieving better balanced accuracy with smaller training sets and faster convergence than random sampling. The study demonstrates a concrete link between label uncertainty (entropy) and segment quality (HoR), using 72 texture features to train a linear SVM and achieve robust deforestation detection. The findings highlight the practical potential of crowd-powered sampling to enhance remote sensing-based monitoring and enable more efficient alerting for forest conservation efforts.

Abstract

Conserving tropical forests is highly relevant socially and ecologically because of their critical role in the global ecosystem. However, ongoing deforestation and degradation affect millions of hectares each year, necessitating government or private initiatives to ensure effective forest monitoring. In April 2019, ForestEyes (FE), a project based on Citizen Science and Machine Learning models, was launched with the aim of providing supplementary data to assist experts from government and non-profit organizations in their deforestation monitoring efforts. Recent research has shown that the labels provided by FE project volunteers/citizen scientists help tailor machine learning models. We therefore use the FE project to create different sampling strategies based on the wisdom of crowds, which select the most suitable samples from the training set to train an SVM and obtain better classification results in deforestation detection tasks. In our experiments, the strategy based on increasing user entropy achieved the best classification results in the deforestation detection task when compared with the random sampling strategies, while also reducing the convergence time of the SVM.
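
The entropy-increasing idea described above can be illustrated with a minimal sketch. It assumes that each segment carries a list of volunteer votes and that label uncertainty is measured as the Shannon entropy of those votes; the exact entropy definition, the class names, and the data layout used in the paper are not given here, so the function names `vote_entropy` and `increasing_entropy_sample` and the toy `segments` list are hypothetical.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Shannon entropy (in bits) of the label distribution in a list of votes.

    Assumption: uncertainty is plain Shannon entropy over vote proportions.
    """
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def increasing_entropy_sample(segments, k):
    """Return the k segments with the lowest vote entropy (most agreement first)."""
    ranked = sorted(segments, key=lambda s: vote_entropy(s["votes"]))
    return ranked[:k]


# Toy data (hypothetical): three segments with five volunteer votes each.
segments = [
    {"id": "a", "votes": ["forest"] * 5},
    {"id": "b", "votes": ["forest"] * 3 + ["non-forest"] * 2},
    {"id": "c", "votes": ["non-forest"] * 4 + ["forest"]},
]

selected = increasing_entropy_sample(segments, 2)
print([s["id"] for s in selected])  # prints ['a', 'c']
```

In this sketch the unanimous segment is picked first and the most contested one last, so a classifier trained on a growing prefix of the ranking would see the least ambiguous crowd labels first. A random-sampling baseline would draw the same number of segments without ranking them.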

Paper Structure (14 sections, 2 equations, 5 figures)

Figures (5)

  • Figure 1: Study Area and its respective ground truth by PRODES.
  • Figure 2: ForestEyes project's schematic representation. The light blue and orange modules correspond, respectively, to the modules implemented in this work.
  • Figure 3: Examples of segments (in red color) with different $HoR$ and entropy values, showing that the higher the $HoR$ value, the lower the entropy tends to be and vice versa.
  • Figure 4: $HoR \times$ Entropy values for Sentinel-2 Campaign.
  • Figure 5: Classification results of the sampling strategies for deforestation detection task.