Development and Application of a Sentinel-2 Satellite Imagery Dataset for Deep-Learning Driven Forest Wildfire Detection

Valeria Martin, K. Brent Venable, Derek Morgan

TL;DR

This paper presents the California Wildfire GeoImaging Dataset (CWGID), a high-resolution bi-temporal collection of over 100,000 labeled RGB before-and-after Sentinel-2 wildfire satellite image pairs. The dataset may help address the scarcity of labeled data for DL-based forest wildfire detection, and it provides a scalable resource that could support other DL applications in environmental monitoring.

Abstract

Forest loss due to natural events, such as wildfires, represents an increasing global challenge that demands advanced analytical methods for effective detection and mitigation. To this end, the integration of satellite imagery with deep learning (DL) methods has become essential. Nevertheless, this approach requires substantial amounts of labeled data to produce accurate results. In this study, we use bi-temporal Sentinel-2 satellite imagery sourced from Google Earth Engine (GEE) to build the California Wildfire GeoImaging Dataset (CWGID), a high-resolution satellite imagery dataset with over 100,000 labeled before-and-after forest wildfire image pairs for wildfire detection through DL. Our methods include data acquisition from authoritative sources, data processing, and an initial dataset analysis using three pre-trained Convolutional Neural Network (CNN) architectures. Our results show that the EF EfficientNet-B0 model achieves the highest accuracy, over 92%, in detecting forest wildfires. The CWGID and the methodology used to build it prove to be a valuable resource for training and testing DL architectures for forest wildfire detection.

Paper Structure

This paper contains 15 sections, 7 figures, 1 table, and 3 algorithms.

Figures (7)

  • Figure 1: Flowchart of the proposed methodology. The diagram illustrates the sequential steps of the workflow followed in this study.
  • Figure 2: Representation of the Polygon Data from the FRAP. Polygons in purple represent wildfires in forested areas, used for the CWGID.
  • Figure 3: Example of a ground truth mask from the CWGID. The mask highlights wildfire-affected areas in purple and unaffected areas in yellow.
  • Figure 4: Example of 256 × 256 px pre- and post-wildfire RGB tiles and their corresponding ground-truth mask.
  • Figure 5: Representation of a CNN with an input image, two convolutional layers, two pooling layers, one fully connected layer, and the output layer. The output has 2 different classes: damaged and undamaged, for which we show an example of classification scores.
  • ...and 2 more figures